
UAVs for the Environmental Sciences

Methods and Applications


Preface

The application of unmanned aerial vehicles (UAVs) in the environmental sciences has increased significantly in the last ten to fifteen years. We, as the editors of this book, have been and still are part of this development of increasing applications of UAVs in environmental studies, including the learning pathway that comes with it. With the opportunities provided by UAVs, we are able to conduct research that was not possible before. UAVs are causing a paradigm change in environmental sciences because it becomes possible to observe the Earth’s surface nearly continuously at spatial resolutions that change our measurement perspective from samples to continuums. Because of the nature of a ‘flying sensor’ and the mostly lightweight and affordable equipment, challenging and difficult-to-access environments, such as deserts, wetlands, cliffy coasts, or alpine areas, can be monitored for the first time or more easily than before. Compared to other surveying methods, such as terrestrial laser scanning or airborne (i.e. by plane) remote sensing, data can be acquired much more easily and faster, mostly with larger coverage or with higher temporal and spatial resolution. These advantages have helped to establish a large group of researchers using UAVs for environmental studies, enabling them to observe processes, patterns, and changes for the first time, owing to unprecedented spatio-temporal resolutions.

Using the same technology and working in the area of environmental sciences brought us together at several conferences, where we recognized that we were all stumbling over the same tripping stones and pitfalls, such as choosing a suitable number and type of ground control points, preparing a task-specific flight plan, or choosing the correct sensor. Thus, we began to realize that a comprehensive overview and teaching book on the application of UAVs in environmental sciences, one that can prevent other researchers from repeating the same mistakes, was still missing. This book elaborates on the fundamental basics of applying UAVs in environmental research, ranging from essentials in planning and preparing UAV flights, sensor systems, data collection and processing, and data analysis to numerous examples of possible fields of application. We hope that this work, which was intended to be openly available from the beginning, can be of great support when working with UAVs in environmental research, helping to obtain optimal data for the myriad applications worldwide. During the course of editing this book, all of us were surprised by the sheer number of applications represented here, which never seems to end.

Lately, the term ‘unmanned aerial vehicle’ has been under discussion because it can be considered not fully inclusive language. Therefore, it has been proposed to use ‘unoccupied aerial vehicle’ or ‘uncrewed aerial vehicle’ instead, which offers the opportunity to keep the same acronym.


Also, the usage of RPAS (Remotely Piloted Aircraft System) has been suggested. However, the community that uses UAVs in environmental sciences has not finalized its decision on which term to use. Thus, we decided to allow all three possibilities for defining UAV. Furthermore, we would like to state that company names, e.g. with regard to software, platforms or sensors, are used without specific recommendation.

We would like to thank the authors for their contributions; without their input this book would of course not have been possible. The authors cover a vast variety of scientific backgrounds and expertise, spanning from engineering to photogrammetry, geo-information science, remote sensing, geomorphology, ecology, hydrology and geology, which underlines very well the diversity of the applicability of UAVs in environmental sciences. The book would also not have been possible without the support of the reviewers, who assisted our editing process. They considered the chapters thoroughly, commented very supportively and helped to improve the book. We would like to thank Simon Buckley, Görres Grenzdörffer, Sören Hese, Eliisa Lotsari, David Mader, Berit Schmitz, Ellen Schwalbe and Christian Thiele for providing their support and expertise. Furthermore, we want to express our gratitude towards Luise Hofmann. We thank the German Research Foundation (DFG) for its trust in a group of early career scientists and its support of the scientific network that allowed for regular meetings to advance our joint book editing. Last but not least, we are very thankful for the support of the publisher: wbg – Wissenschaftliche Buchgesellschaft.

Anette Eltner

Dirk Hoffmeister

Andreas Kaiser

Pierre Karrasch

Lasse Klingbeil

Claudia Stöcker

Alessio Rovere


1 Basics


1.1 Historical developments of UAV use in environmental sciences

Irene Marzolff

1.1.1 Unmanned aerial photography at the origins of remote sensing
1.1.2 Developments in modern SFAP and UAV remote sensing techniques
1.1.3 Terminology in UAV remote sensing today

Ultra-high resolution earth observation data as well as their derivatives have become ubiquitous in environmental research and spatial applications of all kinds. The proliferation of unmanned aerial vehicles, both professional and consumer-grade, together with new miniaturized sensors and 3D image-processing techniques has revolutionized centimetre-precision geodata acquisition within just a decade or so. The concepts, applications and techniques of UAV remote sensing, however, go back a long way, building on nearly 150 years of unmanned remote sensing with various types of sensors and platforms. Kites, balloons, blimps, paragliders, model airplanes and model helicopters are among the most common vehicles employed before the advent of modern drones. Their characteristics vary greatly: tethered or free-flying, powered or unpowered, aerostatic by buoyancy (lighter-than-air) or aerodynamic by means of fixed, flexible, or rotary wings (heavier-than-air), technically basic or highly sophisticated, endurance from minutes to hours, payloads from light to heavy. The choice of system may therefore be matched to a large range of operational, logistic, legal and financial conditions. This chapter traces the development of unmanned airborne remote sensing from its very beginning until today.

1.1.1 Unmanned aerial photography at the origins of remote sensing

For more than a century before the drone age, images of the Earth as seen from above had been taken with the aid of unmanned platforms by scientists, engineers and professional as well as hobby photographers. Experiments with cameras attached to kites and balloons were made in the mid-19th century, as early as ten years after the invention of the daguerreotype, by Colonel Aimé Laussedat, a French engineer later considered the father of photogrammetry. More successful attempts at unmanned aerial photography using balloons were made in the 1860s to 1880s in America, Germany, and France (Aber et al., 2019). The kite, however, was to become the most widely adopted pilotless platform until the early 21st century for obtaining low-altitude images for the environmental and geo-sciences, for archaeological documentation and for landscape photography in arts and leisure. Tethered kites must have been the earliest aircraft in history and were already flown in China more than 2000 years ago. Their successful use as a platform for scientific measurements, even before the earliest kite airphotos appear, is documented in publications by British and American meteorologists (e.g. Archibald, 1884, McAdie, 1885).

In 1890, the French photographer Arthur Batut published a 70-page booklet on aerial photography with a kite and a simple wooden box-camera entitled La photographie aérienne par cerf-volant (Aerial photography by kite) (Batut, 1890; Figure 1.1-1). In this publication, a UAV photographer will find even today a surprisingly up-to-date documentation of the techniques, concepts and pitfalls of unmanned low-altitude aerial photography. Batut’s progressive ideas on potential applications of such imagery include the use of a newly proposed method for measuring terrain heights from two overlapping photographs: this method was to become stereo-photogrammetry and thus one of the main applications for UAV imagery today (see chapter 2.2). Batut also advocated kite photography as a promising means for environmental monitoring, giving the example of mapping phylloxera infestations in vineyards. This was a pressing issue in late 19th-century France, where wine production had dramatically dropped to 25% following the introduction of this pest from America. The activities of Arthur Batut and Émile Wenz, another French pioneer of kite aerial photography, gained considerable attention in the press, and the method was soon taken up in North America (Beauffort & Dusariez, 1995). Spectacular photographs were taken by the Illinois photographer George R. Lawrence, who used a 22 kg panoramic camera suspended from a train of kites to document the ruins of San Francisco after the devastating 1906 earthquake (Aber et al., 2019).


Figure 1.1-1: Wood-and-paper kite (approx. 2.5 m × 1.75 m) used by Arthur Batut for aerial photography from 1888 onwards. The wooden box camera is fitted with a simple lens; its shutter was triggered with a slow match that was lit before launching the kite.

Photography by A. Batut, 1890. Image credits: Collection Espace photographique Arthur Batut/Archives départementales du Tarn.

After the invention of motor-powered airplanes by Wilbur and Orville Wright at the turn of the 20th century, the role of kites and balloons declined as more and more military and commercial aerial photographs were taken from planes. Technical developments in cameras, films and photo analysis advanced rapidly during World War I and again during World War II, spurred by the need for military reconnaissance and accurate cartographic measuring and mapping. Consequently, early publications on aerial photography were dominated by technical aspects, such as Herbert E. Ives’s handbook on Airplane Photography (Ives, 1920) or Hermann Lüscher’s Photogrammetrie (Lüscher, 1920). However, non-military communities engaged in documenting, monitoring and interpreting landscape patterns and processes also benefited greatly from these developments during the first half of the 20th century. The value of aerial photographs in geography, geology, archaeology and other disciplines was examined in numerous publications of the 1920s (e.g. Hamshaw, 1920, Lee, 1922, Crawford, 1923, Ewald, 1924, Perlewitz, 1926). In his landmark paper on airphotos for landscape ecology studies, the German geographer Carl Troll strongly advocated the development of a systematic research method based on aerial photographs and highlighted their potential for viewing the landscape as a spatial entity (Troll, 1939).

After World War II, the expertise of military photographers and photointerpreters as well as surplus photographic equipment became available for furthering airphoto use in non-military and scientific applications (Colwell, 1997). Systematic aerial surveys and photogrammetric mapping became a standard task for land surveying agencies. The development of satellite remote sensing during the Cold War moved Earth observation into new dimensions yet again, with unmanned platforms at orbital altitudes beyond the airspace and digital scanning sensors providing a new type of imagery. This type of remote sensing data became accessible for civilian use with the beginning of NASA’s Landsat programme in 1972, which still continues today.

By the 1970s, these developments had made not only image acquisition, but also image analysis by aerial photogrammetry and satellite image processing a professional task, carried out by trained specialists with access to dedicated and expensive equipment, hardware and software. Obviously, this presented a rather intimidating barrier to many research endeavours in the Earth, environmental and (cultural) landscape sciences. In many cases, projects could have benefitted greatly from using aerial photography for documenting and monitoring forms, patterns and processes. However, they often required practicable, cost-effective image acquisition methods for taking local-site images at very detailed scale at exactly the right time. The repeat rate of conventional aerial photography or the spatial resolution of satellite imagery – not to mention the considerable costs of such material – prevented their use in studies on small-area and often transitory features and on highly dynamic landscape processes, such as rill and gully erosion processes, coastal morphodynamics, field-based pest infection detection or local-scale vegetation encroachment.

It is therefore not surprising that, at the same time that satellite remote sensing advanced, low-altitude aerial photography employing low-tech platforms and sensors began to make a slow but definite comeback during the 1970s and ’80s, paving the way for today’s discipline of UAV remote sensing.

1.1.2 Developments in modern SFAP and UAV remote sensing techniques

Beginning in the early 1970s, consumer-grade small-format cameras were increasingly used for taking airphotos from low flying heights in archaeology and cultural heritage studies, and also in forestry, agriculture, vegetation studies and physical geography. By the 1990s, the term small-format aerial photography or SFAP (Warner et al., 1996, ASPRS, 1997), sometimes also low-altitude aerial photography (LAAP), had become established for a niche remote-sensing technique that was pursued by a small community of enthusiasts willing to face the technical challenges inherent in using non-metric cameras for aerial photography at low flying heights. Studies using manned small aircraft were soon outnumbered by those using unmanned platforms such as kites, balloons, helium blimps and other non-conventional, often custom-built aircraft. In particular, KAP (kite aerial photography) as a “sub-discipline” of SFAP became increasingly popular for scientific purposes (testified by numerous publications until today; e.g. Bigras, 1997, Boike & Yoshikawa, 2003, Smith et al., 2009, Aber et al., 2020), but also for leisure and artistic purposes – similar to the coexistence of today’s scientific UAV remote-sensing community and hobbyist drone community. Kites are still valued as an unpowered alternative to UAVs in sensitive, windy, high-altitude or flight-restricted environments (Duffy & Anderson, 2016, Feurer et al., 2018, Wigmore & Mark, 2018).


Table 1.1-1: Most common constellations of sensors, platforms and image-analysis techniques for primary data generation used in unmanned aerial remote sensing since 1970. Dominant uses are formatted in italics.

Considering the 50 years since the revival of unmanned aerial photography in the early 1970s, an accelerating development in three sectors can be identified: sensors, platforms, and image-analysis techniques (Table 1.1-1). For three decades, the most common sensor-platform combination comprised 35 mm small-format film cameras suspended from tethered kites, balloons or blimps (Figure 1.1-2). These image acquisition techniques changed little from Ullmann’s plastic-balloon photography of raised moors (Ullmann, 1971) to the author’s own SFAP beginnings with hot-air balloons (Marzolff & Ries, 1997). Various types of radio-controlled rigs and mounts evolved for attaching the camera (or dual-camera stereo or multispectral arrangements) to the kite line, balloon or blimp (Aber et al., 2019). The technical challenges at that time included intermittent surveys with film-roll changes every 36 pictures, no quality control before the film had been processed, and constraints on the size and nature of the survey areas, as tethered platforms require direct access to the site and limit the survey range both horizontally and vertically. For lighter-than-air platforms, access to propane, helium or (more dangerously) hydrogen as lifting or fuelling gas was required – not all of them readily available throughout the world. Manually navigated model aircraft began to complement the tethered platforms in the 1980s (Przybilla & Wester-Ebbinghaus, 1979, Koo, 1993). However, they were less popular due to their strong vibrations, their technical complexity and the considerable pilot skills required.

Figure 1.1-2: Left: Tethered hot-air blimp used by physical geographers of Freiburg and Frankfurt Universities for monitoring erosion and vegetation in northern Spain, 1996 (Marzolff, 1999). Right: Large rokkaku kite designed for lifting an SLR camera with a sledge-type rig (see inset) in light to medium winds; here seen during an aerial survey of a gully site in South Morocco, 2006. Photographs by the author. Figure modified after Figs. 7-14A and 7-21 in Aber et al. 2019; copyright Elsevier – all rights reserved.


Figure 1.1-4: Left: Topographic map of Gully Oursi, Burkina Faso, created in 2002. The gully scarps, contour lines and height points were manually mapped as 3D vector data with ERDAS StereoAnalyst using a stereo-model from scanned 35-mm slides that was viewed with active shutter glasses on a stereo computer monitor. Right: TIN surface model computed from the 3D points and lines. Adapted from Marzolff et al. (2003).

Figure 1.1-3: Scanned 35-mm slide taken with an analogue Pentax SLR camera fitted with additional fiducial marks for digital photogrammetric processing. This kite aerial photograph of archaeological excavations at Tell Chuera settlement mound, northeastern Syria, was taken by the author in 2003, the last year before digital cameras took over.


While sensors and platforms changed little during this phase, image-analysis techniques (which had been traditionally analogue during the 1970s and 1980s) shifted towards digital processing of scanned negatives or slides in the 1990s (Figure 1.1-3). The same image-processing techniques already established in satellite remote sensing (e.g. filtering, spectral transformations and ratioing, image classification) were applied to the digitized colour and colour-infrared images (e.g. Fouché & Booysen, 1994, Bürkert et al., 1996, Marzolff, 1999). At the same time, photogrammetric analysis of SFAP remained mostly analogue, depending on access to professional equipment and expertise, and was predominantly used in archaeology and cultural heritage documentation (e.g. Wanzke, 1984, Summers & Summers, 1994).

Although digital cameras were already available in the 1990s (above all, the KODAK DCS series; e.g. Mills et al., 1996), camera price and image quality only began to compete with analogue photography in the mid-2000s. At the same time, softcopy photogrammetry became accessible to the non-specialist as hardware requirements and software prices decreased. Small-format digital compact cameras and digital single-lens reflex (DSLR) cameras then quickly replaced 35 mm film in unmanned aerial photography, considerably speeding up image acquisition. The range of applications using digital image-processing techniques began to widen rapidly (e.g. Baker et al., 2004, Eisenbeiss, 2004, Hunt et al., 2005, Marani et al., 2006). Again, archaeology was the first discipline to take advantage of digital stereo-photogrammetric analysis with small-format airphotos (e.g. Karras et al., 1999, Altan et al., 2004). Its use in geomorphology – although this is the geoscientific discipline most interested in 3D forms – remained rare (Marzolff et al., 2003, Marzolff & Poesen, 2009, Smith et al., 2009) due to the considerable photogrammetric expertise required, the high degree of manual stereo-mapping involved (Figure 1.1-4) and the comparatively low quality and density of point clouds extracted by automatic image matching (Figure 1.1-5).

Analogue film cameras had become largely obsolete by 2010, but digital analysis of SFAP remained a challenge: image-processing and photogrammetry software was still devised for far lower-resolution satellite imagery and (scanned) metric large-format airphotos with precise camera calibration, abundant ground control and highly regular vertical image-acquisition schemes – none of it typical for the output of an SFAP survey with a kite and compact camera on a windy day in the field. Also, the geodata formats available for storing and analysing continuous surface information – TINs, raster elevation models – did not allow modelling of true 3D surfaces, but only what is often termed 2.5D (one z value only per x/y location (chapter 3.4); Aber et al., 2019). By the mid to late 2000s, the potential of SFAP in high-resolution topographic data acquisition seemed exhausted. As the support hotline of a market-leading geospatial software company put it to the author in 2004: “I fear that, given the special nature of your data, you have reached the limits of feasibility.”


Looking back, this quotation marks the end of the “pre-UAV” and “pre-SfM” remote sensing era. From about 2005 onwards, GPS/INS flight-stabilization systems and autopilot flight controllers found their way into model aircraft (e.g. Hardin & Jackson, 2005), upgrading them to unmanned aerial vehicles. Before long, the rapid advancement both of platforms and software pushed the aforementioned limits of feasibility far out of sight. The literature on UAV-based remote sensing for environmental applications has been growing exponentially since 2010 (Simic Milas et al., 2018; Aber et al., 2019). Major innovations in all three sectors listed in Table 1.1-1 are decisive for this development, which was also reviewed by Colomina & Molina (2014), Cummings et al. (2017), Pajares (2015), Manfreda et al. (2018), Yao et al. (2019) and Tmušić et al. (2020):

Figure 1.1-5: Left: Comparison of five point clouds built by state-of-the-art photogrammetry software in 2004 and 2020 (note the logarithmic scale of the y axis). Film transparencies (scanned with 1800 dpi; A & C) and simultaneous digital camera images (B, D & E), all with GSD 2.3 cm, were processed with traditional photogrammetry (Leica Photogrammetry Suite; A & B) in 2004 and again with Structure-from-Motion photogrammetry (Agisoft Metashape; C, D & E) in 2020. The highest possible point-cloud resolution in LPS was 1 point/(3*GSD)² = 210 points/m², but gaps in surface reconstruction, particularly in shadowed and textureless areas, resulted in much lower average densities for film, and also for digital images (models A–B). In contrast, the 120 points/m² target density (= “medium quality” tier of 1 point/(4*GSD)²) is easily surpassed by current SfM algorithms, which perform nearly identically for film and digital sensors (C & D). Model E is computed with SfM “ultrahigh quality” (1 point/GSD²). Photographs taken by the author with analogue and digital Canon EOS cameras in 2004 at Gully Negratín 3, Southern Spain.
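The densities quoted in this caption follow directly from the ground sampling distance. As a minimal worked example (values taken from the caption, rounding aside), the point densities can be reproduced as follows:

```python
# Point-cloud densities implied by the ground sampling distance (GSD),
# reproducing the figures quoted in the caption above (GSD = 2.3 cm).
gsd = 0.023  # ground sampling distance in metres

def density(n: float, gsd: float) -> float:
    """Points per square metre when one point is extracted per (n * GSD)^2 cell."""
    return 1.0 / (n * gsd) ** 2

print(round(density(3, gsd)))  # 1 point/(3*GSD)^2 -> ~210 points/m^2 (LPS maximum)
print(round(density(4, gsd)))  # 1 point/(4*GSD)^2 -> ~118 points/m^2 ('medium quality')
print(round(density(1, gsd)))  # 1 point/GSD^2     -> ~1890 points/m^2 ('ultrahigh quality')
```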


Platforms: The recent technological advancements and price decline of GNSS- and INS-based navigation and flight-control systems for autopiloted unmanned aerial systems (chapter 1.3) have made drones of all varieties the prevalent means of Earth observation from low heights. They are employed not only for geo- and environmental research, but also in a wide range of civil, commercial and governmental applications concerned with surveying, mapping, monitoring, inspection and surveillance. Fixed-wing, multi-rotor and hybrid VTOL (vertical take-off and landing) UAVs in a wide range of prices, sizes and technical configurations are now available on the consumer and professional market (van Blyenburgh, 2018). Of these, the micro (or small) and mini UAV classes, with maximum take-off weights of 5 or 25 kg, respectively, have become the most common in scientific use. Unlike traditional model aircraft, UAVs fly autonomously or in semi-automatic mode (where the human pilot is assisted by the flight-control system). By the mid-2010s, professional-grade UAVs with high-precision RTK/PPK GNSS (real-time or post-processing kinematic global navigation satellite systems) became available, so ground control for georeferencing may be reduced or omitted (chapter 2.1). Platform hardware is complemented by a large choice of flight-planning and ground-station software. Although legislation amendments lag behind this development of UAV technology, unmanned aircraft have come increasingly under the control of airspace regulations – an ongoing development challenging UAV use for research in many countries (see chapter 1.4).

Sensors: In addition to an ever-increasing choice of RGB cameras suitable for UAVs (Aber et al., 2019), small-format to miniature multispectral sensors for near-infrared and short-wave infrared wavelengths and subsequently hyperspectral sensors have become widely available (Yang et al., 2017, Manfreda et al., 2018). Passive sensors specifically designed for UAS include fully integrated on-board mini-cameras with electronic gimbals, mirrorless interchangeable-lens cameras (MILC) without display and viewfinder, multi-sensor arrays with synchronized monochrome sensors sensitive to red, red-edge and near-infrared wavelengths, and combined visible and thermal imaging sensors (see chapters 2.4 and 2.5).

Image-analysis techniques: The development of new and often open-source software coupling photogrammetric principles with computer-vision concepts and algorithms – specifically Structure from Motion-Multi-View Stereo or SfM-MVS (Smith et al., 2016, Eltner & Sofia, 2020; see chapter 2.2) – has revolutionized high-resolution 3D geodata acquisition and orthophoto generation in terms of speed, ease and cost-effectiveness. The main differences to classical photogrammetry are the specific focus on non-metric, small-format cameras, the higher flexibility regarding scales, image schemes and image orientations, the multi-view stereo approach, more powerful image-matching algorithms (see Figure 1.1-5), the possibility of creating 3D models without ground control and a higher degree of workflow automation. A typical UAS image-processing software now includes automatic bundle-block adjustment, extraction of 3D point clouds, interpolation into 2.5D DEMs or 3D meshes and creation of orthophoto mosaics as well as – NIR imagery provided – vegetation index maps. UAV data-processing modules have also become available for leading GIS and image-processing software packages. In addition, numerous software tools not specifically designed for UAV data facilitate advanced analyses of UAV-based 3D point clouds, meshes and (ortho-)imagery (see also chapters 3.2–3.5). Automated co-registration and image tracking approaches have begun to turn UAV remote sensing 4D, adding time to space for monitoring dynamic changes of soil, water and ice (e.g. Turner et al., 2015, Jouvet et al., 2018, Pinton et al., 2020).
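The distinction between 2.5D DEMs and true-3D representations mentioned above can be made concrete with a minimal sketch (the coordinates are invented for illustration): a raster DEM stores a single elevation per x/y cell, so overhanging geometry present in a point cloud or mesh is necessarily lost.

```python
# Minimal illustration of the 2.5D limitation: a raster DEM keeps only one
# z value per x/y location, so overhanging geometry cannot be represented.
import numpy as np

# Three points sharing the same x/y position, e.g. on an overhanging gully wall
points = np.array([
    [1.0, 2.0, 10.0],   # ground below the overhang
    [1.0, 2.0, 12.5],   # underside of the overhang
    [1.0, 2.0, 14.0],   # top of the overhang
])

# A true-3D representation (point cloud or mesh) keeps all three points.
print("points in 3D cloud:", len(points))            # 3

# A 2.5D DEM cell stores a single elevation only, e.g. the uppermost surface:
dem_value = points[:, 2].max()
print("value of the 2.5D DEM cell:", dem_value)      # 14.0 (the overhang information is lost)
```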

The developments of the last decade have also seen an increased integration of these three sectors. For example, small quadcopter UAS with dedicated flight-control software connected to black-box cloud-processing services may allow a user with next to no specialist knowledge to conduct an aerial survey and generate decent DEMs and orthophotos, all within an hour. While the quality and accuracy of these quick-and-easy products are limited and certainly not suitable for all research questions, many simpler applications may not require more precise and advanced data.

1.1.3 Terminology in UAV remote sensing today

The term unmanned aerial vehicle or UAV appeared in scientific publications on Earth and environmental studies around 2005, corresponding to the technical developments outlined above. During the first years of transition, there remained some indecision as to whether the “vehicle” should also include traditional tethered platforms such as kites and balloons (e.g. Eisenbeiss, 2009), but definitions of “UAV” in the research literature soon agreed on free-flying, powered aircraft that may be flown remotely by a pilot on the ground or programmed to fly autonomously along specified routes to designated waypoints. Nevertheless, a confusing variety of terms and acronyms exists next to UAV, for which the following summary is based on an overview given by the author elsewhere (Aber et al., 2019). For a thorough review of the origins and chronology of the terminology, the reader is referred to Granshaw (2018).

Undoubtedly, “drone” is the colloquial term most commonly used in everyday language for a small aircraft without an on-board human pilot. This term was originally introduced in the 1940s as the official US Navy designation for unmanned target aircraft (Granshaw, 2018), but has been unpopular with many civilian users of UAVs due to its association with often debated military operations. This ambivalent connotation of the term has faded away more recently as small consumer-grade quadcopters, in particular, have become ubiquitous in non-military uses of all kinds. Drone is now the preferred term in popular scientific contexts, governmental applications and leisure activities.


In academic and professional usage, however, the most common terms remain UAV or UAS (unmanned aerial system or unmanned aircraft system), which includes remotely piloted and autonomously navigated aircraft. In little-used varieties of the term, the U may also stand for unpiloted or uninhabited. In the more regulatory context of the US Federal Aviation Administration (FAA) and the European Aviation Safety Agency (EASA), “aircraft system” rather than “aerial vehicle” is preferred, as the aircraft component stresses the need for airworthiness, and the system includes ground-control stations, communication links, and launch and retrieval operations in addition to the vehicle (Dalamagkidis, 2015). Other common terms are RPA (remotely piloted aircraft), RPV (remotely piloted vehicle) and RPAS (remotely piloted aircraft system). These are seen as distinct from UAS by the International Civil Aviation Organization (ICAO), as the latter includes fully autonomous aircraft not allowing pilot intervention, which are primarily used in military contexts (ICAO, 2015; Granshaw, 2018). The term RPAS is most commonly used in contexts of explicitly civilian aviation regulation. It is worth noting that none of the currently valid definitions by regulatory agencies – not even those addressing the “system” (e.g. EASA, 2009) – includes any reference to cameras or other sensors carried by the unmanned, remotely or autonomously piloted aircraft. To the scientific communities engaged in Earth and environmental research, however, carrying the sensors used for geospatial data acquisition clearly is the main purpose of these platforms, whichever term is used for them.

One and a half centuries after its beginnings, unmanned aerial remote sensing has reached an unprecedented degree of automation from image acquisition to finished geodata product – and also an unprecedented range of sophistication, from simple visual interpretation of micro-drone airphotos to multi-sensor, artificial-intelligence and high-precision approaches for investigating, amongst many others, detailed soil-surface and riverbed structures (Onnen et al., 2020; Mandlburger et al., 2020), machine-learning classification of trees (Xu et al., 2020) or modelling of canopy thermal emissions (Bian et al., 2021). Many of the questions we strive to answer as Earth and environmental scientists, however, have been part of this history all along – even though Arthur Batut, flying a kite with a wooden box-camera over the destroyed vineyards of his home region in 1890, could never have imagined the deep-learning segmentation approaches used for vine-disease detection by his French colleagues of today (Kerkech et al., 2020).

References for further reading


1.2 Comparing UAV to other remote sensing techniques

Alessandro Matese

1.2.1 Overview of remote sensing platforms
1.2.1.1 Satellite
1.2.1.2 Aircraft
1.2.1.3 Unmanned Aerial Vehicles
1.2.2 Comparison of UAVs to other remote sensing platforms
1.2.2.1 Strengths
1.2.2.2 Weaknesses

Since the beginning of the 1900s, various platforms have carried cameras to collect images; today, satellites and UAVs acquire most of the data collected remotely. The platform is the structure or vehicle on which the remote sensing instruments are mounted. Operating at a distance from the target, remote sensors can collect a large amount of data in a short time, ensuring rapid data acquisition even over large areas. A remote sensing platform must be able to support the weight of a sensor, remain at a given altitude, remotely take a series of images at a specific time and then return those images for different applications. The potential of these platforms for environmental remote sensing has been demonstrated by many authors; the purpose of this chapter is to define what types of data and accuracies can be achieved with UAVs compared to other remote sensing techniques, and the pros and cons of UAVs versus satellite and aircraft-based platforms.


1.2.1 Overview of remote sensing platforms

1.2.1.1 Satellite

Remote sensing from space-based orbital platforms for information collection had its beginnings in the early 1960s, when it first became possible to place cameras in polar Earth orbit to remotely photograph any point on the globe on a routine and predictable basis (Pabian, 2015). Among the first applications of satellite-based imagery collection were military purposes. The Landsat programme (1972) of the US Government was the first open project to provide publicly accessible imagery from space. The Landsat-1 satellite carried digital scanning sensors covering four multispectral bands that provided a spatial resolution of 80 m, while Landsat 7 (1999) and Landsat 8 (2013), in addition to having eight multispectral 30 m bands and two thermal 100 m infrared bands, also have a 15 m resolution panchromatic band. The French SPOT-1 satellite, launched in 1986, provided 20 m multispectral and 10 m panchromatic ground resolution. In the early 2000s, Ikonos became capable of providing electro-optical imagery at a resolution of less than 2 m; resolution has since reached 31 cm with WorldView-3, the sharpest imagery currently available.

A revolution in terms of accessibility came with the Copernicus programme of the European Commission (EC), under which the European Space Agency (ESA) launched the Sentinel-2B mission in 2017, acquiring high spatial resolution (10 to 60 m) optical imagery. The free, full and open data policy adopted for the Copernicus programme provides access to the Sentinel data for all users and offers an unprecedented combination of systematic global coverage of land and coastal areas, a high revisit frequency of five days under the same viewing conditions, high spatial resolution, and a wide field of view (295 km) for multispectral observations in 13 bands in the visible, near infrared and short-wave infrared range of the electromagnetic spectrum (Drusch et al., 2012).

1.2.1.2 Aircraft

In recent years, the advent of UAVs has overshadowed the use of aircraft for many remote sensing activities. Although aircraft are still widely used by public and private institutions for large-scale land-use monitoring and inspection purposes, their characteristics place them as a “middle way” between satellites and UAVs. Their main strength remains the payload capacity. In fact, they can carry much heavier sensors than UAVs, such as LiDAR and combinations of sensors of various kinds. A UAV that can only carry one sensor at a time would have to make multiple passes, thus increasing flight time and processing time, while a manned aircraft carrying multiple sensors could collect all data in one pass.

Aircraft face many restrictions on their use: operators must obtain airspace permits, plan adequate take-off and landing points, and comply with ever-changing flight restrictions. Finally, as UAVs continue to improve, flying longer, withstanding higher wind speeds and carrying more sophisticated payloads, the overlap between their mapping capabilities and those of manned aircraft will increase. The next large increase in the number of UAVs and their applications will come when national regulatory bodies allow flights beyond visual line of sight (BVLOS) in controlled airspace.

1.2.1.3 Unmanned Aerial Vehicles

The initial use of UAV systems and platforms was for inspection, surveillance and mapping of military areas, followed by geomatic applications. UAV photogrammetry opens up several new applications in the short-range aerial domain and also introduces low-cost alternatives to classical manned aerial photogrammetry (Colomina et al., 2008). This development can be explained by the diffusion of low-cost platforms combined with RGB digital cameras and GNSS/INS systems, which are necessary to navigate the UAV with high precision to the predefined acquisition points. The small size and low payload of some UAV platforms limit the transport of high-quality IMU devices such as those coupled with aerial cameras or LiDAR sensors used for mapping. Simple, hand-launched UAVs operating autonomously using their autopilot with GPS and, in general, an IMU sensor are the most economical systems, although platform stability in windy areas can be a problem (Nex & Remondino, 2014). More stable systems, usually with a petrol engine and a higher payload, allow a more professional camera on board or even surveying with LiDAR instruments. Typical domains where UAV images and 3D data derived from photogrammetry or orthoimagery are generally used include agriculture, with the aim of producing maps with high spatial resolution for precision agriculture applications that support agronomic decisions in different areas of the field (Matese & Di Gennaro, 2018). Assessments of woodlots, fire surveillance, species identification, volume computation and tree detection are the main applications in forestry (Wallace et al., 2012). Environmental surveying for land and water monitoring is also feasible. A large number of applications exist in the archaeology and cultural heritage domain, where 3D mapping of sites and structures is easily achieved with a low-altitude image-based survey (Remondino et al., 2011).


1.2.2 Comparison of UAVs to other remote sensing platforms

Recent advances in UAV technologies have produced alternative monitoring platforms that offer the opportunity to acquire spatial, spectral and temporal information in a wide range of applications at a relatively low cost. They offer high versatility, adaptability and flexibility compared to other remote sensing techniques such as satellites or aircraft due to their potential to be rapidly and repeatedly deployed for high spatial and temporal resolution data. Despite the recent and rapid increase in the number and scope of satellites, the temporal resolution and availability of current satellite sensors with very high spatial resolution are neither sufficient nor flexible enough for many remote sensing applications, especially in forestry and agriculture. Moreover, most of these satellites are managed by commercial organizations and the cost of the images can be high if short survey intervals are required. Aircraft can provide both high spatial resolution and rapid revisit times, but their use is limited by operational complexity, safety, logistics and costs; it is feasible only over medium-sized areas and remains largely run by commercial operators. Some countries, moreover, do not have aircraft for this kind of acquisition, and remote areas are also difficult to reach. In a comparison of the three monitoring platforms, UAVs are an economical technique for limited areas (around 5 ha), while for larger areas (around 50 ha) aircraft or satellite platforms can be more effective options. Obviously, regulations limit the economic advantages linked to their use and some potential applications, even if operational adjustments are being evaluated that will certainly facilitate their use in the coming years.

The vulnerability of UAVs to weather conditions (i.e. wind, rain) that can degrade the monitoring quality is certainly a negative aspect, even if the other platforms are not immune to weather conditions either in terms of operability and data quality (e.g. cloud coverage for satellites). One of the aspects that directly affects the area that can be surveyed is the limited flight time of UAVs due to the payload and battery power supply. Most UAVs are powered by electric batteries, others by combustion engines that use gasoline as fuel. The actual flight time for a single survey may not be sufficient for a given application, and careful mission planning is therefore required. However, this problem is currently solved by planning that allows the management of multiple flights. Advances in the field of hardware technology offer new solutions that will extend the flight duration up to two hours, making the use of UAVs more competitive.

The recent and rapid developments in sensor miniaturization, standardization and cost reduction have opened up new possibilities for UAV applications. There is a large variety of sensors available for UAVs, from RGB cameras to multispectral and hyperspectral cameras, thermal cameras, GNSS RTK, IMU and LiDAR. The most common sensor is the RGB camera, which takes high-quality images for interpretation or photogrammetry. Small multispectral cameras with bands in the near-infrared range can be mounted on UAVs. Thermal infrared sensors are commonly used for inspections in urban areas but also in agriculture, while LiDAR sensors are available for use on UAVs for both urban and forestry purposes. Most UAVs provide real-time video transmission to the remote-control point, so that the operator can accurately track the flight.

The most common difficulties related to the acquisition of UAV imagery include image blurring due to the forward movement of the platform, resolution variations due to variable flight height, orthorectification problems due to geometric distortion associated with inadequate image overlap, and spectral effects induced by variable lighting during the flight, to mention only the main problems.

It is therefore essential to consider best practices in mission planning and the sensor’s configuration and setup before the flight to avoid the issues above. The various radiometric, geometric and atmospheric corrections and calibrations should then be considered before the mosaicking, georeferencing and orthorectification procedures. Together, these aspects are crucial for data acquisition and post-processing, which provide the necessary starting point for subsequent application-specific analyses. However, despite the existence of consolidated workflows in photogrammetry devoted to aircraft or satellite acquisitions, UAV systems introduce various additional complexities, which until now have not been fully addressed.

The high spatial resolution of UAV data generates a strong demand for data storage and data processing capacity, which results in the need to implement workflow procedures for pre- and post-processing. Whereas satellite data are generally delivered through a processing chain that guarantees the final data quality, in the case of UAVs all of this is left to the end user. For a profitable data-processing workflow, it is necessary to consider the whole computational chain from raw images to final products, which also allows a better comparison of the three remote sensing platforms. The strengths of UAV acquisition obviously lie in the higher resolution and precision, but at the cost of a greater effort for mosaicking and geocoding.

Large amounts of digital data can be acquired in a single UAV flight. Collecting overlapping images of reasonable size over an area of 10 ha can result in thousands of individual images that need to be processed. Also, in order to obtain good results in post-processing, image overlaps are required, sometimes greater than 80 %.
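A rough, back-of-the-envelope estimate illustrates how quickly the image count grows; the sensor size, GSD and overlaps in the sketch below are example values only, assuming a nadir-looking camera and a regular survey grid.

```python
# Rough estimate of the number of images needed to cover a survey block.
# Sensor size, GSD and overlaps are illustrative assumptions, not prescriptions.
import math

img_w, img_h = 5472, 3648           # image size in pixels (20 MP class camera)
gsd = 0.01                          # ground sampling distance in metres (1 cm)
forward_overlap, side_overlap = 0.85, 0.80
area_m2 = 10 * 10_000               # 10 ha expressed in square metres

footprint_w = img_w * gsd           # ground footprint across track (m)
footprint_h = img_h * gsd           # ground footprint along track (m)

# New ground area contributed by each photo once the overlaps are subtracted
new_area_per_photo = (footprint_w * (1 - side_overlap)) * (footprint_h * (1 - forward_overlap))

n_images = math.ceil(area_m2 / new_area_per_photo)
print(f"~{n_images} images for 10 ha")   # on the order of 1,700 images with these settings
```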

The problem does not concern computer storage but the processing phases, especially the mosaicking, which requires computers with high computational specifications, especially regarding RAM and GPU. Obviously, this applies if results must be produced within a minimum turnaround time; otherwise, good-quality processing can be done even with slower computation times. Furthermore, while cloud service providers eliminate the need to think about hardware and software, particularly for image processing, there is still a bottleneck regarding image upload and return times.

Most of the processing activities, in particular the new algorithms for image processing and computer vision, are developed as software libraries that are user friendly but not easy to modify. In agricultural applications (chapter 4.7), the radiometric correction in particular remains a very delicate aspect: the multispectral sensors acquire RAW images in DN (Digital Numbers), which must then be converted into radiance and subsequently into reflectance to be used in the calculation of various vegetation indices. Usually, irradiance sensors connected to the UAV are used to convert directly to reflectance, or empirical calibrations are performed using Lambertian panels with known reflectance placed on the ground before the flight.
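As a minimal sketch of the panel-based (empirical line) approach mentioned above, the conversion from DN to reflectance can be expressed as a linear fit through the panel measurements, after which a vegetation index such as NDVI can be computed; all panel reflectances and DN values below are invented for illustration.

```python
# Empirical-line calibration (DN -> reflectance) from two reference panels,
# followed by an NDVI calculation. All numbers are illustrative only.
import numpy as np

panel_reflectance = np.array([0.05, 0.50])      # known dark and bright panel reflectances
panel_dn_red = np.array([1200.0, 14500.0])      # mean DN measured over each panel (red band)
panel_dn_nir = np.array([1100.0, 15800.0])      # mean DN measured over each panel (NIR band)

# Fit reflectance = gain * DN + offset for each band
gain_r, offset_r = np.polyfit(panel_dn_red, panel_reflectance, 1)
gain_n, offset_n = np.polyfit(panel_dn_nir, panel_reflectance, 1)

# Apply the calibration to (dummy) image arrays standing in for the two bands
dn_red = np.array([[3000.0, 5000.0], [8000.0, 12000.0]])
dn_nir = np.array([[9000.0, 11000.0], [13000.0, 15000.0]])
refl_red = gain_r * dn_red + offset_r
refl_nir = gain_n * dn_nir + offset_n

# Vegetation index from the calibrated reflectances
ndvi = (refl_nir - refl_red) / (refl_nir + refl_red)
print(ndvi.round(2))
```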

Flight or mission planning is the first essential step for UAV data acquisition and has a profound impact on the acquired data and processing workflow (chapter 1.5). Similar to other remote sensing approaches, a set of parameters must be considered before flying, such as platform specifications, extent of the study site (area of interest), ground sampling distance, payload characteristics, topography, study objectives, weather forecasts and local flight regulations.

As for costs, an additional advantage of the UAV platform is of course that the temporal resolution is limited only by the number of flights (power supply/battery capacity), so any cost equivalence is quickly exceeded thanks to repeatability. The costs for acquiring UAV data generally derive from the initial investment, processing software, data storage and associated fieldwork costs. However, after the initial investment, the data sets can be supplied more often and at a higher resolution than with any other system. In comparing the acquisition and processing costs of the three different platforms (UAVs, aircraft and satellites), UAVs are identified as the most economical solution for fields of 20 ha or less (Matese et al., 2015). An NDVI (Normalized Difference Vegetation Index) map derived from UAV imagery of a 5 ha field costs approximately 2,000 €, while over larger areas the costs of acquisition, georeferencing and orthorectification have a negative impact on the cost of UAV-derived images.

Regarding the cost of satellite images, a wide spectrum ranges from the free images of ESA with a maximum spatial resolution of 10 m to satellite images with a resolution of 1 m, which are prohibitively expensive. Of course, it is not an equivalent evaluation to compare these platforms on an image-by-image basis, as it is the richness of the spatial and temporal resolution of UAV systems that makes their application so flexible. In addition to allowing the high resolutions required for many applications, sensors mounted on UAVs have numerous other advantages that are fundamental in a wide range of applications, providing quick access to environmental data and offering the near real-time functionality required.

1.2.2.1 Strengths

The most direct and important advantage of UAVs is the ability to acquire high-resolution images, which, depending on the flight altitude and sensor spatial resolution, can reach a ground sampling distance (GSD) of a few millimetres. The GSD is the distance between two consecutive pixel centres measured on the ground. The bigger the value of the image GSD, the lower the spatial resolution of the image and the fewer details are visible. Using such high-resolution images and photogrammetric software, it is possible to develop very high resolution 2D orthomosaics and 3D point clouds (Figure 1.2-1).

Figure 1.2-1: 3D point cloud of a vineyard developed using UAV imagery.

All images were prepared by the authors for this chapter.
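For a nadir image, the GSD follows from the camera geometry and the flight height: GSD = pixel pitch × height / focal length. The sketch below illustrates this relation; the sensor and lens values are illustrative assumptions, not tied to a specific platform.

```python
# Ground sampling distance (GSD) from camera geometry and flight height.
# Sensor and lens values below are examples only.
def gsd(pixel_pitch_m: float, focal_length_m: float, height_m: float) -> float:
    """GSD in metres for a nadir image taken at the given height above ground."""
    return pixel_pitch_m * height_m / focal_length_m

# Example: a 20 MP, 1-inch sensor (~2.4 µm pixels) behind an 8.8 mm lens
print(gsd(2.4e-6, 8.8e-3, 100.0))   # ~0.027 m -> ~2.7 cm GSD at 100 m flight height
print(gsd(2.4e-6, 8.8e-3, 30.0))    # ~0.008 m -> ~8 mm GSD at 30 m flight height
```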

Another positive aspect of UAVs is ease of management. Using new control technologies, UAVs can be managed by users with relatively little experience. Furthermore, they offer much greater manoeuvrability when flying in areas that are difficult to reach and at low altitudes. In addition, there are platforms on the market with “open” technology for rapid prototyping, and there is therefore the possibility of designing and implementing platforms with different types of sensors, also in integrated configurations (Figure 1.2-2). One aspect to be taken into account is pilot certification: in Europe, theoretical and practical training followed by aeronautical tests and medical assessments are required to obtain a certificate. Even though the technical and aeronautical background knowledge needed to become a pilot is a prerequisite, thorough knowledge of the regulations is mandatory. A pilot needs to know the operational limitations, risk management and administrative procedures perfectly in order to avoid incidents and failures.


Figure 1.2-2: UAV equipped with different cameras for precision agriculture applications.

Concerning operational times and costs, UAVs can acquire data very quickly as regards mission planning and implementation times. As to the costs, there are now very cheap platforms on the market, but also service companies that operate in all disciplines at relatively inexpensive charges per surface area. Moreover, it is easy to repeat the monitoring of the same area at different times to capture any changes. Using very powerful and inexpensive processing software, it is also possible to deliver the processed data within tight deadlines, useful for example for supporting decisions in agriculture or forestry (chapter 4.7 and chapter 4.4).

Poor weather conditions can result in reduced visibility, loss of communication, or loss of control. The influence of the wind on UAV behaviour and the on-board energy limitations are important parameters that must be taken into account; in fact, wind and turbulence play the largest role in aviation weather accidents. Manual or semi-manual piloting of a UAV has proven to be tiring and stressful due to the constant need to compensate for perturbations caused by meteorological phenomena, often reducing the quality of acquired image blocks (i.e. irregular overlaps). The major ways in which wind affects UAVs include changing the flight trajectory, limiting control and reducing battery life. Extreme temperatures have negative implications for the physical components of an aircraft as well as its aerodynamic performance. Precipitation affects UAVs in a variety of ways: just as with fog and high levels of humidity, precipitation can reduce visibility and damage electronics. UAVs are very flexible in terms of cloud coverage, even if some sensors, for example multispectral sensors for agricultural applications, require suitable light conditions because, in most cases, they are passive optical sensors that measure crop radiance.

UAVs equipped with GPS can be precisely programmed and piloted to exact positions, even more so when using RTK technologies. This is particularly useful in precision farming, where UAVs are used for a variety of needs such as spraying fertilizers and pesticides, identifying weed infestations and monitoring crop health (Figure 1.2-3). There are also models on the market that carry very large tanks for greater operability of UAVs.

Figure 1.2-3: UAV prototype equipped with a tank for spraying applications.

1.2.2.2 Weaknesses

The greatest negative aspect in the use of UAVs is the limited spatial coverage of a single flight and therefore the flight time and autonomy. Although fixed-wing UAVs can reach flight times of over one hour, multirotor UAVs have a short flight time of 20 minutes to one hour, which limits the area that can be monitored.

An important example is the geometrical compromise between altitude and flight coverage area, which results from the sensor’s field of view (FOV) and the limited UAV autonomy in terms of energy, which determines its flight time. The higher the altitude, the greater the ground coverage. At the same time, higher altitudes lead to fewer images per unit area to be processed, but also to a coarser GSD and therefore to fewer details that can be detected in the images. Obviously, to optimize the efficiency of a UAV survey, it is a good idea to plan the flight mission carefully, considering the flight altitude, the weight of the payload and the batteries used.
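The trade-off can be put into numbers with the same illustrative camera values used above: doubling the flight height doubles the GSD and roughly quadruples the ground footprint of each image, so fewer images cover the same area but with coarser detail.

```python
# Altitude versus coverage trade-off for a nadir-looking camera.
# Pixel pitch, focal length and image size are illustrative assumptions.
pixel_pitch, focal_length = 2.4e-6, 8.8e-3      # metres
img_w, img_h = 5472, 3648                        # image size in pixels

for height in (30, 60, 120):                     # flight heights in metres
    gsd = pixel_pitch * height / focal_length
    footprint = (img_w * gsd) * (img_h * gsd)    # ground area of a single image, m^2
    print(f"{height:>3} m: GSD {gsd * 100:.1f} cm, footprint {footprint:,.0f} m^2")
```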

The greatest advantage of using multi-rotors is the possibility of mounting the different sensors on a gimbal, which stabilizes them during flight. At the same time, small UAVs can only carry light sensors, and this is a limiting factor both for the type of sensors that can be installed and for the flight times. Indeed, the choice of payload has to take into account the autonomy (battery) and thus the flight time, as well as the flight stability.

Since the widespread use of UAVs is relatively new, legislation is still in the process of providing regulations, although legislation is already in place in more than 50 % of the countries of the world that is capable both of maintaining security and of allowing UAV operators to work flexibly. In fact, most countries have already adopted risk-based approaches and provide good frameworks for safe UAV flights.

UAV technology continues to improve, and with it the software used to process the acquired data. However, the fact remains that the large amount of data acquired, compared to satellite images for example (considering the same survey area), requires long computation times and high processing power.

Table 1.2-1: Comparative characteristics for different remote sensing platforms (+ positive, - negative, ns not significant).

References for further reading


1.3 Technical basics of UAVs

Jérôme Ammann, Philippe Grandjean, France Floc’h, Stéphane Bertin, Marion Jaud, Pascal Allemand, Nicolas Le Dantec, Christophe Delacourt

1.3.1 Types of UAVs
1.3.1.1 Rotorcraft: multirotor and helicopter UAVs
1.3.1.2 Fixed-wing UAVs
1.3.1.3 Hybrid and multipurpose UAVs
1.3.2 Components
1.3.2.1 Autopilot
1.3.2.2 Communication system
1.3.2.3 Motorization
1.3.2.4 Internal and external safety
1.3.2.5 Adaptation of the payload on the UAV
1.3.3 Operation
1.3.3.1 Flight modes
1.3.3.2 Mission planning

UAV or drone technology is nowadays increasingly plug and play, ready to fly (RTF), and affordable. It makes airborne platforms available to all, opening up new possibilities for research, observation, and data acquisition. The number of UAV models available on the market is increasing rapidly. It is therefore essential to correctly define requirements in order to choose the right model.

This chapter describes the technical basics necessary to understand the fundamental characteristics of UAVs. Here we will discuss only those UAVs with take-off weights between 1 kg and 25 kg and capable of carrying scientific payloads up to 5 kg, with detailed descriptions also provided in the following chapters. These types of UAVs are subject to specific regulations and require a drone pilot licence to operate them (see chapter 1.4 for more details on regulations).

This chapter is divided into three parts: (1) different types of UAVs, (2) their main components, and (3) flight operations.

1.3.1 Types of UAVs

There are two main types of UAVs: firstly, those equipped with propellers providing lift and thrust (helicopter and multirotor) and, secondly, those equipped with fixed wings and a propeller for thrust. There are also ‘hybrid’ UAVs inspired by both types. This section provides an overview of the different types of existing UAVs. Table 1.3-1 summarizes the advantages and disadvantages of each type.

1.3.1.1 Rotorcraft: multirotor and helicopter UAVs

Multirotor and helicopter UAVs are drones lifted by propeller rotors, the rotors being located on a horizontal plane. A helicopter UAV is a rotorcraft with one or two rotors, and can be powered by an electric motor or an internal combustion engine. A multirotor UAV can have four, eight or twelve  rotors, which are exclusively electrically powered. Rotorcraft UAVs have two  specific flight characteristics that differentiate them from other types of drones:

• Vertical Take Off and Landing (VTOL) for reduced spaces (cities, cliffs, etc.).

• Hover mode, offering the possibility of 360° observations from a fixed point, typically used for the monitoring and inspection of specific sites at close range (engineering structures, cliffs, forests, farmland) and for activities requiring contact or sampling (gas, rock, water, etc.).

The main criterion when selecting a multirotor UAV is the payload carrying capacity according to scientific requirements and flight autonomy. The latter is difficult to determine precisely, with observations in the field often differing from manufacturer values. Actual flight autonomy depends on the energy capacity of the batteries, the total take-off weight (including empty UAV weight, battery weight and payload weight), the flight scenario to be carried out and the wind conditions.

Flight autonomy is specific to each UAV; it can be refined with flight experience and optimised with suitable battery management (see chapter 2.2). At the end of a flight, it is essential to ensure a margin of 15–20 % of remaining battery capacity. Indeed, the autonomy curves depending on payload weight provided by manufacturers constitute a maximum limit under optimal conditions.

There are many RTF (ready to fly), compact and light (< 2 kg) multirotor UAVs on the market today (Figure 1.3-1A). They usually carry a single pre-integrated sensor (digital, thermal, or multispectral camera). They are mass-produced drones for the general public, and specializing them around one type of sensor keeps their cost affordable. The priority of these general-public drones is thus simplicity of use and piloting. They are equipped with pilot assistance functions such as automatic take-off and landing, return path memory with obstacle avoidance, and propeller protection. Their light weight and small size make them suitable for use in urban environments.

Figure 1.3-1: UAVs in use at IUEM’s Ocean Geosciences Laboratory. (A) Compact RTF multirotor UAV (DJI Phantom) with an integrated image sensor. (B) Custom-built multirotor UAV in flight with its gyro-stabilized hyperspectral sensor payload in a waterproof case. (C) Custom-built helicopter UAV (190 cm rotor diameter) equipped with side instrument pods (in white), a Reflex camera, and a thermal camera synchronized with a RTK-GPS. (D) SenseFly eBee commercial RTF fixed-wing UAV with an integrated image sensor. Image credits: (A) Jérôme Ammann LGO, CNRS – UBO. (B) Christophe Prunier LGO, CNRS – UBO. (C) Philippe Grandjean Univ-Lyon1. (D) Mouncef Sedratti LGO, CNRS – UBS.

However, these small drones are not adequate for all the scientific requirements outlined in this book, particularly when the payload comprises complex instrumentation such as several sensors (e.g., LiDAR, hyperspectral camera, thermal camera, various probes, sample collector, etc.). Currently, there are few RTF drones capable of carrying 3–5 kg of payload, and those that can offer only very limited autonomy. Multi-sensor multirotor UAVs (Figure 1.3-1B) are thus specifically designed around the scientific payload and mission-specific requirements. Some UAVs now permit a take-off weight of up to 25 kg. They are generally produced in limited or medium series, or can be custom-built, depending on users’ needs. The cost is thus higher compared to an RTF drone.

Caution: not all RTF or custom drones offer the same reliability. It is important to consult user feedback on specialised forums and to consider the quality of the components used.

The advantage of the helicopter UAV (Figure 1.3-1C) is that it is a versatile carrier. It is possible to mount one or more sensors of different types (digital, thermal, or multispectral camera, and sampling system) without having to redesign the whole system or to add additional power. The flight autonomy of the helicopter UAV depends on its motorization and more particularly on the rotor-motor or rotor-engine energy efficiency, with internal combustion engines remaining the most efficient in terms of autonomy for this type of UAV.

Example: for the same helicopter UAV, one hour of level flight will require 1.5 litres of fuel (fuel weight: 1.2 kg) for an internal combustion engine, twelve litres of fuel (fuel weight: 10 kg) for a turboshaft engine, and 8 kg of Lithium-Polymer batteries for an electric motor.

While the helicopter UAV equipped with an internal combustion engine appears to be more energy efficient and offers a level of autonomy of over one hour, it is noisy and produces unwanted vibrations, against which payloads must be protected. The assembly of the UAV and its permanent tuning require the expertise of a qualified technician. It is often an artisanal drone of custom design or, more rarely, small series production. It is also relatively complex to fly a helicopter UAV, meaning that they require the experience of a qualified drone pilot. The helicopter UAV is not a ready to fly drone.

1.3.1.2 Fixed-wing UAVs

Fixed-wing UAVs (Figure 1.3-1D) are very aerodynamic and fast. One reason is that the payload is integrated into the fuselage, which considerably reduces drag. Thanks to their aerodynamics and large lift-to-drag ratio (equal to the glide ratio, i.e., the horizontal-to-vertical speed ratio in glide), fixed-wing UAVs have the ability to fly for a long time (autonomy generally ranging from one to several hours). On the other hand, unlike multirotors, they are not able to hover (i.e., to stand still in the air).

The flight of a fixed-wing UAV is constrained by its wing loading, i.e., the ratio between the weight of the UAV and the surface of the airfoils (wings, stabilizer), by the angle of attack that the wing forms with the level flight position (fuselage horizontal), and by the flight envelope comprised between the stall speed and the never-exceed speed, the latter inducing a risk of structural break-up of the UAV if exceeded. The normal operating horizontal speed must be well above the stall speed to ensure a safe flight.
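As a rough illustration of these constraints, the sketch below estimates wing loading and stall speed from the lift equation; the UAV mass, wing area and maximum lift coefficient used here are purely illustrative assumptions, not values for any particular model.

```python
import math

def stall_speed(mass_kg, wing_area_m2, cl_max=1.3, air_density=1.225):
    """Stall speed (m/s) from the lift equation: W = 0.5 * rho * v^2 * S * CL_max."""
    weight_n = mass_kg * 9.81
    return math.sqrt(2 * weight_n / (air_density * wing_area_m2 * cl_max))

def wing_loading(mass_kg, wing_area_m2):
    """Wing loading in kg/m^2."""
    return mass_kg / wing_area_m2

# Hypothetical 2 kg fixed-wing UAV with 0.25 m^2 of wing area
mass, area = 2.0, 0.25
v_stall = stall_speed(mass, area)
print(f"Wing loading: {wing_loading(mass, area):.1f} kg/m^2")
print(f"Estimated stall speed: {v_stall:.1f} m/s ({v_stall * 3.6:.0f} km/h)")
print(f"Suggested minimum cruise speed (~1.5 x stall): {1.5 * v_stall * 3.6:.0f} km/h")
```

With these assumed values the stall speed comes out near 36 km/h and a safe cruise speed around 55 km/h, which is consistent with the typical 40–70 km/h flight speeds mentioned below.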

Like multirotor UAVs, there are light (< 2 kg) and compact fixed-wing UAVs of the RTF type (Figure 1.3-1D), equipped with a single on-board sensor pre-integrated into the fuselage (digital, thermal, or multispectral camera). The structure of these light UAVs is mainly composed of expanded polypropylene (EPP). They are propelled by a propeller driven by an electric motor. The flight speed is generally between 40 and 70 km/h. Models with an autonomy of one hour can cover more than 50 km in linear flight. Fixed-wing UAVs are typically equipped with piloting assistance functions, all stages of the flight being managed by the autopilot. They are launched by hand and perform gliding and belly landing. The main constraint of compact and light fixed-wing UAVs is the need for a clear, unobstructed environment during take-off and landing.

In the case of fixed-wing UAVs with several sensors, as for their multi-sensor multirotor counterparts, it is generally complicated to integrate the complete instrumentation into the fuselage, and UAV dimensions have to increase in proportion. For those UAVs, the wingspan can reach more than 2 m while maintaining a weight of less than 25 kg. They are capable of travelling long distances (over 100 km), propelled by an internal combustion engine. However, in the absence of wheel landing gear, and as they become too heavy to be launched by hand, a catapult is generally used for take-off, while recovery will require a net or a sling. In the event of use on-board a ship, the catapult and sling are indispensable. The use of this type of UAV can require special qualifications for both the pilot and the UAV itself, a subject that is beyond the scope of this chapter.

1.3.1.3 Hybrid and multipurpose UAVs

Combining the advantages of the VTOL of rotary-wing (rotorcraft) UAVs and the large autonomy of fixed-wing models, hybrid UAVs have recently become available. In that case, an electric motor/rotor ensures the VTOL phases, while the wings are used for horizontal and level flights using either an electric motor or an internal combustion engine.

To increase the flight autonomy of a multirotor UAV, it can also be made captive by continuously powering it from the ground via an electric cable. The cable can also be used to recover large data flows (increasing image quality in real time) and to avoid the risk of the drone flying away. Flight is then only possible in hover mode. This sort of UAV can be used for remote surveillance and telecommunication relay applications. It is nevertheless possible to increase the range of a UAV by installing the winch and the operator on-board a moving vehicle (car or boat).

Finally, to allow flights under heavy rain or snow, it is necessary to reinforce the water tightness of UAVs and their components, or to choose a UAV designed for this purpose. It is possible to find amphibious drones capable of operating both underwater and in the air. However, making such UAVs suitable for both environments constitutes a compromise that diminishes the performance obtained in either environment.

To sum up this first chapter content, in order to choose the right type of UAV, it is important to precisely define the scientific requirements and, in particular, the main sensor to be used. Then, using easy-to-collect information on the operating environment, it is possible to define the type of UAV required by following the indications given in Table 1.3-1 below (for UAVs up to 25 kg).

Table 1.3-1: Summary of the autonomy, load, constraints, and advantages of each type of UAV.

*VTOL: Vertical Take-Off and Landing – **RTF: Ready To Fly

1.3.2 Components

Historically, airborne data acquisition such as aerial photography was performed using aircraft piloted by a person on board. In order to reduce mission cost, and to access the aerial domain regardless of location, weather conditions and conventional aircraft availability, model aircraft carrying a self-triggering camera were conceived. At first, model aircraft were radio-controlled remotely from the ground without any assistance on board (Figure 1.3-2). Even with the support of an experienced pilot, issues such as poor flight stability preventing good pictures from being obtained and large geolocation uncertainty were common. In the end, this technique generated more waste than useful data (~80 % of low-quality photos). Thus, in order to retrieve the stability of the traditional aircraft, it became essential to ‘put the pilot back on board’, but in an electronic form. The autopilot was born and with it, the drone. Today, using the latest generations of autopilots and associated UAVs, it is possible to achieve almost 99.9 % of good photos exploitable in photogrammetry.


Figure 1.3-2: Feedback and pilot principles for a manned remote-control aircraft.

Image credits: Jérôme Ammann LGO, CNRS – UBO.

UAVs or drones are basically model aircraft equipped with a programmable autopilot and a navigation system (based on GNSS) that render them virtually automated (Figure 1.3-3). It has thus been possible for them to be developed very rapidly for the civilian and, more particularly, scientific communities. Faced with the rapid expansion of UAVs that are freely available for ci-vilian use, civil aviation authorities have had to regulate their use and their flight areas in order to integrate them into air traffic (e.g., French DGAC, 2012, European EASA, 2021). UAVs are today considered as air users in the same way as other aircraft. Their use is therefore subject to constraints (declaration of the UAVs and pilots, activities, and incidents), and their equipment and components are subject to a duty of reliability.

Figure 1.3-3: Feedback and autopilot principles for an unmanned aerial vehicle (UAV).

Image credits: Jérôme Ammann LGO, CNRS – UBO.

A UAV for scientific applications comprises five basic components:

• an autopilot controlling the stability and positioning of the drone, and its trajectory in the air;

• a communication and telemetry system providing the link between the pilot on the ground and the drone;

• a motorisation system allowing the drone to be moved in three dimensions;

• a platform for payloads (scientific instrumentation);

• an independent system ensuring safety in the event of a major drone failure.

These different components are detailed in the following sections.

1.3.2.1 Autopilot

Architecture and functions. The autopilot behaves like a real on-board pilot. In other words, it must first and foremost ensure flight stability in all conditions and pilot the UAV to a desired destination. Even in the event that the connection with the drone pilot on the ground is lost, the autopilot is able to continue along its programmed flight plan. Physically, the autopilot is a programmable microcontroller (Figure 1.3-4) that manages flight controls and data flows from the on-board sensors. It comprises one or more printed circuit boards (PCB). The autopilot controls the UAV stability (attitude) by analysing the data coming from the Inertial Motion Unit (IMU) several times per second.

Figure 1.3-4: Overview of autopilot operation.

Image credits: Jérôme Ammann LGO, CNRS – UBO.

The IMU comprises a set of motion and orientation sensors: a gyroscope for angular rotation movements, an accelerometer for linear movements, and a magnetometer for orientation with respect to the magnetic north, i.e., a total of nine sensors for the three flight axes (pitch, roll and yaw). The autopilot positions the UAV in a terrestrial frame of reference and measures its speed using a GNSS receiver (GPS (USA), GLONASS (Russia), Galileo (Europe), BeiDou (China)). The autonomous positioning accuracy is metric (~5 m, corresponding to the intrinsic value of absolute GNSS); it can be improved to about 1 cm along the horizontal axes and ~2 cm along the vertical axis through the use of a Real Time Kinematic GPS system (RTK-GPS) (more details in chapter 2.1).

The pressure sensor allows the precise measurement of the UAV’s altitude relative to the take-off point (to within a few tens of centimetres). Some autopilots also incorporate optical, laser and ultrasonic sensors capable of detecting fixed or moving obstacles as well as the ground location.

Figure 1.3-5: Difference between fixed wing and multirotor UAVs.

Image credits: Jérôme Ammann LGO, CNRS – UBO.

In order to control the motion of the UAV, the drone pilot on the ground sends a direction change command to the UAV, which is materialized by an electrical signal transmitted to the autopilot. The autopilot acts on the motorization and on the control surfaces (air foils) of the three flight axes (pitch, roll and yaw) (Figure 1.3-5). In the case of an electric multirotor UAV, the autopilot controls the speed of rotation of the electric motors in accordance with the attitude provided by the IMU. Upon receiving a change of direction command, the autopilot changes the rotational speed of some of the motors to create a movement in the desired direction. Hence, we can say that motorisation and three-axis flight control are mixed together (Figure 1.3-5). In the case of a helicopter or fixed-wing UAV, the autopilot commands the control surfaces of each flight axis also on the basis of the attitude data provided by the IMU. Since the control surfaces are mechanical structures and the autopilot is a PCB, the autopilot transmits its commands to the control surfaces via electro-mechanical transducers called servomotors. There are as many servomotors as there are control surfaces. We can see here that the control/command technology of multirotor UAVs is simpler than that of helicopter and fixed-wing UAVs. This has greatly contributed to the rapid uptake and success of multirotor drones.
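To make the multirotor ‘mixing’ of motorisation and three-axis control more concrete, the following minimal sketch shows how throttle, roll, pitch and yaw corrections might be combined into individual motor commands for a quadcopter in ‘X’ configuration. The sign conventions, motor layout and clipping to [0, 1] are illustrative assumptions; real autopilots embed such a mixer inside closed-loop attitude control running at high frequency.

```python
def quad_x_mixer(throttle, roll, pitch, yaw):
    """
    Minimal motor mixer for a quadcopter in 'X' configuration.
    Assumed convention: positive roll = roll right, positive pitch = nose down
    (fly forward), positive yaw = rotate clockwise seen from above.
    throttle is in [0, 1]; roll, pitch, yaw are small corrections in [-1, 1].
    Returns commands for (front-left, front-right, rear-left, rear-right).
    """
    m_fl = throttle + roll - pitch - yaw   # front-left,  spins clockwise
    m_fr = throttle - roll - pitch + yaw   # front-right, spins counter-clockwise
    m_rl = throttle + roll + pitch + yaw   # rear-left,   spins counter-clockwise
    m_rr = throttle - roll + pitch - yaw   # rear-right,  spins clockwise
    return [min(max(m, 0.0), 1.0) for m in (m_fl, m_fr, m_rl, m_rr)]

# Hover throttle with a small nose-down (forward) command: rear motors speed up
print(quad_x_mixer(throttle=0.5, roll=0.0, pitch=0.1, yaw=0.0))
```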

Open-source autopilots offer total freedom in the design of the flight functionalities. Indeed, the developer, often a drone pilot, can implement his/her own UAV behaviour functions. Developers have federated into a large online community where functions, programs, interfaces, and information are exchanged to encourage and facilitate the development of their technology. Some open-source autopilots go beyond primary flight management functions, using communication protocols such as MAVLink (Micro Air Vehicle Link) to interface with compatible user sensors.

Black box autopilots (Figures 1.3-6 and 1.3-7) are pre-programmed by the manufacturer for specific uses depending on the type of UAV. The user does not have access to the development of the primary flight management functions. The manufacturer provides access to certain parameters only for installing the autopilot on the UAV and programming flight plans. To ensure proper use, the manufacturer provides detailed documentation and html interfaces for programming and implementation from the ground.

Figure 1.3-6: (A) One-piece version of a black box autopilot (all the functions of the autopilot are in the same box). (B) Modular version of a black box autopilot (each function has its own box, here in grey). Image credits: Jérôme Ammann LGO, CNRS – UBO.

Failsafe mode and redundancy. What happens in the event of an autopilot failure? Imposed by the regulations as an autopilot safety function, the failsafe mode switches the UAV operation to a specific behaviour in the event that a failure is detected. Depending on the developers’ choice, this behaviour can notably be a return to the point of origin at a specific altitude (return to home), an emergency landing at the UAV’s location, or a looped flight of the UAV at a specific altitude. To ensure the correct operation of the UAV throughout the mission, some manufacturers prefer to increase the reliability of their autopilot systems by doubling, or even tripling, all of the associated functions and sensors (Figure 1.3-7). This is known as redundancy.

Figure 1.3-7: Push-pull motorized UAV equipped with an autopilot with triple function and navigation redundancy (three IMUs and three GNSSs, two RTK-GPS, advanced diagnostic algorithms, compass capable of resisting magnetic interference from metal structures). Image credits: Jérôme Ammann LGO, CNRS – UBO.

Caution. The programmed obsolescence by some manufacturers of their autopilot systems reduces the service life of the UAVs concerned. This can notably be the case for mass-produced black box autopilots. The same logic applies as with certain computer operating systems: the manufacturer no longer maintains the firmware version of the autopilot, the compatibility with the ground station software, or sometimes both. In this way, the user is encouraged to replace the autopilot, and even the whole UAV in the case of RTF type drones. For both open-source and black box autopilots, it is essential to check the validity of the firmware version before updating it. This can be done by consulting the manufacturer’s documentation and user websites. If the drone system is working well, then it is perhaps not worth taking the risk of updating the firmware.

1.3.2.2 Communication system

The communication system must be able to transmit the flight controls and flight plan to the UAV, and to control the payload. It must also be able to receive data on the status of the UAV and the telemetry values (UAV position, alarms, remaining battery capacity, the voltage, current and fuel levels for internal combustion engines, motor/engine speed of rotation, etc.), video feedback (good continuity of images), and payload measurements. The transmission system can be im-plemented by one or more transceiver units, preferably point-to-point (P2P) for UAV piloting or via relay stations to increase the operational distance. Below, we present the main frequencies and power levels that are used for UAV data transmission.

The 2.4 GHz band (2400–2483.5 MHz) is freely usable for wideband data transmission equipment (sub-class 22) compliant with European standard EN 300 328, which includes new-generation radio control units using spread-spectrum technology. Since 2012, the permitted power level is 100 mW.

The 5.8 GHz band with a power limit of 25 mW is used for the transmission of video feedback and First-Person View (FPV) flight. The range can be improved by replacing the standard omnidirectional quarter-wave dipole antennas, which have a gain of about 2 dBi (decibels-isotropic), with lobe antennas of higher gain (5 dBi, 8 dBi, etc.).

The WiFi 802.11 standard (2.4 GHz and 5.8 GHz bands) offers a data rate of several hundred Mbps and a range limited to 100 m. By correctly tuning the antenna gain and by reducing the data rate to 150 Mbps, as is the case with WiFi Air Max 802.11n, ranges of several kilometres can be reached in free-field conditions (output power 27 dBm).
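As a rough way to see how transmit power and antenna gain translate into range, the sketch below evaluates a simple free-space link budget. The transmit power, antenna gains, receiver sensitivity and fade margin in the example are assumptions for illustration only; real ranges are further reduced by obstacles, multipath and regulatory limits.

```python
import math

def max_range_km(tx_power_dbm, tx_gain_dbi, rx_gain_dbi, rx_sensitivity_dbm,
                 freq_mhz, fade_margin_db=20):
    """
    Free-space link-budget estimate of the maximum usable range.
    FSPL(dB) = 32.44 + 20*log10(d_km) + 20*log10(f_MHz); the limit is reached
    when the received power minus a fade margin equals the receiver sensitivity.
    """
    link_budget_db = (tx_power_dbm + tx_gain_dbi + rx_gain_dbi
                      - fade_margin_db - rx_sensitivity_dbm)
    return 10 ** ((link_budget_db - 32.44 - 20 * math.log10(freq_mhz)) / 20)

# Hypothetical 2.4 GHz link: 27 dBm output, 5 dBi antennas, -90 dBm sensitivity
print(f"~{max_range_km(27, 5, 5, -90, 2400):.1f} km in free-field conditions")
```

With these assumed values the estimate is on the order of a couple of kilometres, consistent with the free-field ranges quoted above.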

The 4G LTE (Long Term Evolution) standard, set up for mobile telephony to ensure very high-speed wireless access to the Internet (150 Mbit/s), allows communication with the UAV without any distance concerns. Flying the UAV via the Internet can, however, pose security problems (network stability, latency, hacking). It is, therefore, advisable to use it only for piloting the on-board payload.

1.3.2.3 Motorization

Propeller motorization principle. Most of the UAVs described above are powered by one or more propellers. The propeller is the interface between the engine or motor and the air in which the UAV is flying. It provides the thrust required to move the UAV, while the motor provides the power that the propeller converts into thrust. To obtain the best performance, it is necessary to ensure that the propeller lets the motor/engine run at its maximum power rating. In general, it is the motor/engine manufacturers who size the drive unit with a small range of propellers according to a specific range of use. It is important to respect this and not to try to improve UAV performance by changing the propeller size alone; it is better to change the complete propeller and motor/engine block. The advent of the electric brushless motor has revived the use of electric motors to power UAVs. This type of motor delivers high power and has a high-speed dynamic. Its magnets are light and efficient. As it has no brushes or commutator, there is no friction between the rotor and the stator. This results in low inertia and is the reason why the motor is designated as ‘brushless’.

Principle of verification of the correct motorization of a UAV . A UAV in level flight should not use more than 50 % of the throttle. In other words, the thrust exerted by all of the motors at 50 % of their rotating speed (rpm) must allow the total weight of the UAV to be lifted at take-off. Total take-off weight includes everything: empty UAV weight, payload weight in operational configuration, and weight of all of the batteries necessary to ensure the desired autonomy. These data allow for estimating, in theory, whether the UAV is correctly motorized.
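This 50 % throttle rule of thumb can be turned into a quick sizing check, as in the sketch below; the motor count and per-motor maximum static thrust are hypothetical values, and the check assumes thrust scales roughly with throttle.

```python
def motorization_check(takeoff_mass_kg, n_motors, max_thrust_per_motor_kg):
    """
    Rule of thumb from the text: in hover/level flight the UAV should not need
    more than ~50 % throttle, i.e. total maximum thrust should be at least
    about twice the total take-off weight (thrust given in kg-force).
    """
    total_max_thrust = n_motors * max_thrust_per_motor_kg
    thrust_to_weight = total_max_thrust / takeoff_mass_kg
    hover_fraction = takeoff_mass_kg / total_max_thrust  # fraction of full thrust to hover
    return thrust_to_weight, hover_fraction

# Hypothetical hexacopter: 9 kg take-off weight, 6 motors of 3 kg max thrust each
twr, hover = motorization_check(9.0, 6, 3.0)
print(f"Thrust-to-weight ratio: {twr:.1f} (aim for >= 2)")
print(f"Estimated hover throttle: {hover:.0%} (aim for <= 50 %)")
```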

Caution. The autonomy values announced for a UAV with a free payload (other than a com-pact UAV) may be biased.

• They do not always account for the payload weight (test performed without payload).

• The maximum permissible payload weight may require additional batteries to advantageously increase autonomy.

• The maximum autonomy duration corresponds to the moment when the battery is completely run down (0 %). At 0 %, the drone falls out of the sky. It is, therefore, essential to always factor in a margin of 15–20 %.

For your application with a particular payload, the actual autonomy may be well below that estimated by the manufacturer. Recalculating the theoretical autonomy from all the known data (weights, propeller thrust, number of motors/engines, normal or push-pull assembly, etc.) will make it possible to decide whether the proposed UAV is undersized with respect to the initial requirement. Brushless motors require little maintenance other than listening for unusual noises. These noises can mean that an attachment screw is coming loose, or that the propeller is forcing too much and creating play in the motor. It is important to deal with these issues quickly to avoid a crash.
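As a minimal illustration of the autonomy recalculation mentioned above, the sketch below estimates electric flight time from battery energy and an assumed average power draw, keeping the 15–20 % reserve recommended earlier; all numerical values are hypothetical.

```python
def flight_time_min(battery_capacity_mah, battery_voltage_v, avg_power_draw_w,
                    usable_fraction=0.8):
    """
    Rough electric flight-time estimate: usable battery energy divided by the
    average power draw. usable_fraction keeps a 15-20 % safety margin.
    """
    energy_wh = battery_capacity_mah / 1000 * battery_voltage_v
    return energy_wh * usable_fraction / avg_power_draw_w * 60

# Hypothetical 6S 12,000 mAh pack (22.2 V) on a UAV drawing ~600 W with payload
print(f"Estimated flight time: {flight_time_min(12000, 22.2, 600):.0f} min")
```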

Batteries . Among the existing battery technologies (NiCd, NiMH, Li-ion, LiPo, LiFe, Pb, etc.), we will mainly focus on lithium batteries for UAVs because, at equal voltage and capacity, they are the lightest. The cells of a lithium polymer or LiPo battery (Figure 1.3-8) are connected in a stack of layers in which the power is concentrated. This technology can provide a high current without destruction. There are two important battery parameters:

The numbers of S and P . The battery pack is an assembly of parallel-connected (P) and se-ries-connected (S) cells. It is designated by the number of associated cells.

Example: 4S2P thus corresponds to a combination of four series-connected cells and two parallel-connected cells. If one cell has a voltage of 3.7 V and a capacity of 6,000 mAh, then the pack designated 4S2P will have a resulting voltage of 4 x 3.7 = 14.8 V and a total capacity of 2 x 6,000 = 12,000 mAh.

The number of ‘C’s indicated on the battery multiplied by the nominal value of the charge cor-responds to the instantaneous current that can be delivered.

Example: a battery with a charge of 6,750 mAh can deliver this current for one hour. Multiplied by the indicated number 25C, it is capable of instantaneously supplying 25 x 6,750 mA = 168,750 mA, i.e., about 169 A.
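The S/P and C-rating arithmetic can be summarised in a few lines of code; the default cell voltage, cell capacity and C rating below simply restate the examples given in the text.

```python
def battery_pack(cells_series, cells_parallel, cell_voltage_v=3.7,
                 cell_capacity_mah=6000, c_rating=25):
    """
    LiPo pack characteristics from its S/P designation (e.g. 4S2P).
    Note: C rating times capacity gives the maximum continuous current in
    amperes once the capacity is expressed in Ah.
    """
    voltage = cells_series * cell_voltage_v
    capacity_mah = cells_parallel * cell_capacity_mah
    max_current_a = c_rating * capacity_mah / 1000
    energy_wh = voltage * capacity_mah / 1000
    return voltage, capacity_mah, max_current_a, energy_wh

v, cap, i_max, e = battery_pack(4, 2)   # the 4S2P example from the text
print(f"{v:.1f} V, {cap} mAh, max continuous current {i_max:.0f} A, {e:.0f} Wh")
```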

As lithium is a flammable material, it is considered a class nine (dangerous goods) product. Precautions must thus be taken when using these batteries, notably (a) never exceeding the indicated charging current and (b) recharging in a safety bag, in a ventilated area and under supervision. In all cases, users should consult the maintenance and user booklet supplied with the UAV and the battery usage instructions. Finally, when batteries are not used for several days, it is recommended to store them at a reduced charge (using the storage function of the charger, which brings the battery charge down to ~30–40 %); this recommendation increases the battery lifetime. Some batteries, known as ‘Intelligent Professional Batteries (IPB)’, are equipped with a microcontroller that estimates the level of charge left in the battery and activates storage mode after three days of non-use.

Figure 1.3-8: How to read a battery sticker.

Image credits: Jérôme Ammann LGO, CNRS – UBO.

Generally speaking, lithium batteries offer little autonomy for multirotor type UAVs (under 40 minutes for a compact UAV weighing less than 2 kg, and under ten minutes for a UAV weighing 25 kg). Unlike the fuel of a combustion engine, whose weight decreases during the flight and can slightly improve autonomy, the weight of the on-board battery remains constant throughout the flight even though the battery runs down. Depending on the UAV model, the weight of the battery represents between 22 % and 35 % of the total weight at take-off.

1.3.2.4 Internal and external safety

Geofencing is a geolocation technology that allows the creation of a virtual perimeter over a real geographical area; it is commonly used to restrain UAV movements within a predefined perimeter. Associated with the UAV’s GNSS, the geofencing function will trigger an alert or an action (UAV shutdown) as soon as the virtual border of the authorized zone is crossed. Geofencing zones are of two types:

1. The zone created voluntarily by the user (Figure 1.3-9A): The majority of advanced UAV con-trol software includes a geofencing function that allows for determining, prior to each flight, a zone from which the UAV will not be able to deviate. This function can be very practical in the event of a flight close to a prohibited or sensitive area.

2. The zone created by the UAV manufacturers (Geo Fly zones) (Figure 1.3-9B): Some market leaders have integrated geofencing zones into their control software. These zones, close to air-ports, nuclear power plants and all sensitive places such as sports stadiums, Temporary Flight Restriction Zones, etc., do not even allow the UAV to take off (engine blockage). However, some of these zones can be temporarily unblocked on demand after the manufacturer has identified your UAV.
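A user-defined geofence of the first type essentially reduces to a point-in-polygon test on the UAV position. The following sketch shows a minimal ray-casting implementation; the fence coordinates are hypothetical, and an operational autopilot would of course also handle altitude limits, buffer distances and the configured failsafe action.

```python
def inside_geofence(lat, lon, fence):
    """
    Ray-casting point-in-polygon test for a user-defined geofence.
    `fence` is a list of (lat, lon) vertices; assumes a small area where
    straight lines between vertices are an acceptable approximation.
    """
    inside = False
    n = len(fence)
    for i in range(n):
        lat1, lon1 = fence[i]
        lat2, lon2 = fence[(i + 1) % n]
        # Does the horizontal ray from (lat, lon) cross this polygon edge?
        if (lat1 > lat) != (lat2 > lat):
            lon_cross = lon1 + (lat - lat1) / (lat2 - lat1) * (lon2 - lon1)
            if lon < lon_cross:
                inside = not inside
    return inside

# Hypothetical rectangular fence around a survey site
fence = [(48.390, -4.500), (48.390, -4.490), (48.380, -4.490), (48.380, -4.500)]
print(inside_geofence(48.385, -4.495, fence))   # True: inside, keep flying
print(inside_geofence(48.395, -4.495, fence))   # False: outside, trigger the alert
```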

Figure 1.3-9: (A) Example of user-defined geofencing zone. This geofencing allows the UAV to fly over the river but prevents it from flying over the group of houses. (B) Example of a geofencing zone imposed by the UAV manufacturer. The colours indicate the degree of restriction (blue: authorization via website; orange and red: authorization via provision of official documents).

Image credits: (A) Philippe Grandjean, Univ-Lyon1. (B) Jérôme Ammann LGO, CNRS – UBO.

Anti-collision sensors allow the avoidance of obstacles as well as the maintenance of a safe altitude for the UAV with respect to the ground. They also assist piloting inside buildings when satellites are inaccessible. Anti-collision sensors use a number of complex technologies that work together to create a global system. There are many sensors that can be linked with each other by software programming, and include mathematical modelling and algorithms for real-time orientation (e.g. Simultaneous Localization and Mapping, SLAM).


Table 1.3-2: Anti-collision sensor technologies. The combination of these various systems makes it possible to cumulate the advantages of the different technologies and eliminate the weaknesses associated with each system.

Emergency safety device against fly-aways. A fly-away is an incident due to unintended and unwanted behaviour on the part of the autopilot, sending the UAV away from the flight zone and beyond the control of the pilot on the ground. To mitigate this risk, aviation regulations on UAVs require the implementation of an emergency safety device that can be executed by the drone pilot without any link to the general control system. An emergency safety device manifests itself in the instantaneous shutdown of the motors/engines, thus causing the UAV to fall out of the sky. However, a parachute can be added to this device to limit the impact on the ground.

Air traffic detection system. Automatic detection of airplanes or helicopters flying in the vicinity of a UAV will soon be possible. Indeed, UAVs will be equipped (from 2020 onwards) with a receiver that detects ADS-B signals transmitted by airplanes and helicopters. As soon as an aircraft enters the range of the UAV, the drone pilot will be warned and will be able to see its exact position on a map.


Parachute . The main criterion for the choice of a parachute is its surface area, which, depend-ing on the weight of the UAV, will determine the rate of fall. The latter must be calculated to avoid any bodily injury during a UAV landing. However, the UAV and especially the on-board sensors can be damaged during deployment of a parachute.

Table 1.3-3: Advantages and disadvantages of parachute extraction systems.

Example: for a 4 m² parachute, the rate of fall will range from 2.6 m/s for a UAV weighing 2 kg to 4.4 m/s for a UAV weighing 7 kg. The impact energy will be between six and 68 joules.

Airbag technology is still very little developed in the UAV world. Airbags are a complement to parachute protection. They are mainly developed to protect the payload. Indeed, with the increasingly high value of on-board sensors, it is necessary to protect them in the event of damage to the UAV. A watertight box can also be used for equipment that has to fly over water.
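Returning to the parachute example above, the descent rate follows from balancing the UAV weight against canopy drag. The sketch below reproduces the order of magnitude of those figures; the drag coefficient and air density are assumptions, and manufacturer descent-rate data should always take precedence.

```python
import math

def parachute_descent(mass_kg, canopy_area_m2, drag_coeff=1.3, air_density=1.225):
    """
    Steady-state descent rate under a round canopy (weight = drag) and the
    corresponding impact energy. drag_coeff ~1.2-1.5 is an assumed value for
    a typical round parachute.
    """
    v = math.sqrt(2 * mass_kg * 9.81 / (air_density * drag_coeff * canopy_area_m2))
    energy_j = 0.5 * mass_kg * v ** 2
    return v, energy_j

for m in (2.0, 7.0):   # the 4 m^2 canopy example from the text
    v, e = parachute_descent(m, 4.0)
    print(f"{m:.0f} kg UAV: {v:.1f} m/s descent, impact energy {e:.0f} J")
```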

Caution. These safety devices are not considered by the manufacturer at the time of design, except in the specific case of custom UAVs. The weight of these devices will add to the total take-off weight and will reduce the final autonomy of the UAV.

1.3.2.5 Adaptation of the payload on the UAV

During flight, the voluntary or involuntary movements of the UAV generate vibrations that affect the on-board sensor. To compensate for this and to ensure the optimum functioning of the sensor, two- or three-axis pods can be used to stabilize it. A two-axis pod pivots on two axes, roll (X) and pitch (Y), while a three-axis pod adds an additional axis, yaw (Z). Each type of pod comprises at least one mechanical stabilization element (silent block) to dampen the vibrations, located at the interface between the UAV and the pod. The movements are also compensated by either servomotors or brushless motors. Servomotors are used more on two-axis pods or for sensors that require less accuracy. Brushless motors offer a movement accuracy better than 1/100th of a degree and allow rotation through 360°. They are ideal for three-axis pods and for stabilizing high-frequency acquisition sensors (video, LiDAR, hyperspectral, etc.).

The main characteristics of a pod include the:

• number of motion axes (usually two or three);

• axis travel and stabilization accuracy (in degrees);

• type of motor (servo or brushless);

• payload capacity (maximum weight and dimensions);

• electronics managing the stabilization (specific IMU or autopilot);

• specific position functionalities (nadir, follow-me, video tracking, etc.);

• power supply and consumption;

• intrinsic dimensions (length, width, height, weight).

1.3.3 Operation

1.3.3.1 Flight modes

Manual mode. Visual flight depends on the size of the UAV and its distance of travel. It is used primarily for the take-off and landing phases. By managing the climb and descent path properly, the pilot can save battery time compared with landing in automatic mode. The pilot can follow the flight parameters displayed on a screen interface known as an IOSD (Figure 1.3-10A and 1.3-10B). Manual flight in First Person View (FPV) allows the UAV to be flown as if the drone pilot were on board. The pilot is equipped with a head-mounted display (Figure 1.3-10C) that places his/her entire view inside the UAV. With this mode, the pilot can easily extend the distance and lose track of the point of origin. It is thus recommended to have two pilots in FPV mode. The second pilot, not equipped with a head-mounted display, monitors the UAV behaviour in direct view and its movement on the map to avoid risks due to distance of travel (disorientation, loss of the radio link, etc.) and collisions.

Automatic mode. This is recommended for scientific operations (measurements, photographs, etc.). The UAV flies in a more stable and regular manner than in manual mode. It follows its flight plan by linking a succession of GNSS waypoints programmed during the mission-planning phase. In automatic mode, the autopilot controls all actions during the flight, such as sensor triggering at waypoints.

Figure 1.3-10: Interface On-Screen Display (IOSD): (A) It contains flight indicators such as heading, speed, attitude, turn and slip, number of GNSS satellites received, geographic position, battery voltage level, UAV distance from the ground station, map, and flying mode. (B) IOSD on a tablet screen. (C) IOSD on a helmet for FPV flight, comprising two radio antennas for connecting with the UAV. Image credits: Jérôme Ammann LGO, CNRS – UBO.

Specific mode. Some UAVs offer specific flight modes, such as turning around an object at a fixed distance and directing the camera or sensor at the object (turn-around function), or following a person wearing a transmitter (follow-me function). All these specific modes are nothing more than flights in automatic mode where the form of the flight plan is pre-set by the manufacturer’s software application.

1.3.3.2 Mission planning

Mission planning is the flight preparation phase. In this phase, all the data required is gathered (flight zones, authorizations, resolution, and accuracy required, on-board sensors, UAVs to be used, flight altitudes, weather, back planning of operations, etc.) and the flight plans are developed.

Flight plan and software interfaces. Most ground station software offers touch interfaces that can be used on a tablet or smartphone (Figure 1.3-11). The flight plan is programmed on a georeferenced image by tracing the limits of the plan or via a KML file (Google Earth®). The waypoints are automatically positioned according to the sensor swath and the image overlap rate, the UAV’s speed, and the flight altitude, which can be user defined.

Figure 1.3-11: Creation of a flight plan via a tablet or smartphone compatible with the flight interface: (A) DJI GS PRO (works only with Apple iOS). (B) Litchi (works with both iOS and Android OS). Image credits: Jérôme Ammann LGO, CNRS – UBO.

It is important to consider the terrain topography when deciding the flight altitude and take-off site, especially in the case of steep terrain. Altitude zero is set at the take-off place. The flight altitude determines the spatial extent and resolution of the survey (Figure 1.3-12).
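Because altitude zero is referenced to the take-off point, the actual height above ground varies with the terrain below. The following sketch checks a constant-altitude plan against a terrain elevation profile; the elevation values, minimum clearance and 120 m ceiling used here are illustrative assumptions.

```python
def check_terrain_clearance(flight_alt_m, takeoff_elev_m, terrain_profile_m,
                            min_clearance_m=30, max_agl_m=120):
    """
    Check a constant-altitude flight plan (altitude referenced to the take-off
    point) against a terrain elevation profile. Returns a list of
    (terrain_elevation, height_above_ground, ok) tuples.
    """
    results = []
    for terrain_elev in terrain_profile_m:
        agl = flight_alt_m + takeoff_elev_m - terrain_elev
        ok = min_clearance_m <= agl <= max_agl_m
        results.append((terrain_elev, agl, ok))
    return results

# Hypothetical profile: take-off at 10 m elevation, flying at 100 m, towards a 90 m hill
for terrain, agl, ok in check_terrain_clearance(100, 10, [10, 40, 70, 90]):
    print(f"terrain {terrain:>3} m -> {agl:>3} m AGL {'OK' if ok else 'CHECK'}")
```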

Figure 1.3-12: Adapt the flight altitude to the local topography.

Image credits: Jérôme Ammann LGO, CNRS – UBO.


It is also possible to orient the flight lines in relation to the terrain, the wind direction, or the re-flection of the sun on the water (sun glint), with the software automatically readjusting the flight plan. In principle, the UAV’s speed is controlled to remain constant in relation to the ground. The autopilot adjusts the UAV’s speed and movements depending on the instantaneous wind conditions. In the event of strong winds, it is advisable to postpone the flight. In principle, the manufacturer provides a maximum limit value for the operating wind speed.

Most ground station software applications require an internet connection to retrieve the topographic base map. This is memorized after saving the flight plan. The software prompts the pilot to choose three parameters (Figure 1.3-13): the image sensor model proposed in the list, the overlap rate between two images, and the desired ground resolution (ground sampling distance, e.g. in cm per pixel). Depending on these three parameters, the software calculates all the others (flight altitude and distance, UAV speed, duration of the flight plan, etc.). The total duration of the flight, considering the time to climb to and descend from the working altitude, remains to be estimated in order to evaluate the battery or fuel autonomy. It is important to increase the planned flight time to ensure a safety margin for energy autonomy (20 % of energy remaining). It will therefore sometimes be necessary to divide the site into several flight plans.
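The calculation performed by the ground station software can be approximated with the standard nadir photogrammetry relations, as in the sketch below; the camera parameters are hypothetical example values, not a recommendation for a specific sensor.

```python
def flight_plan_parameters(gsd_m, focal_length_mm, sensor_width_mm, image_width_px,
                           image_height_px, forward_overlap=0.8, side_overlap=0.7):
    """
    Photogrammetric flight-plan geometry for a nadir-looking camera.
    GSD = sensor_width * altitude / (focal_length * image_width_px), inverted
    here to obtain the flight altitude for a requested GSD; the overlaps then
    give the distance between image triggers and between flight lines.
    """
    altitude_m = gsd_m * focal_length_mm * image_width_px / sensor_width_mm
    footprint_w = gsd_m * image_width_px          # across-track ground footprint
    footprint_h = gsd_m * image_height_px         # along-track ground footprint
    trigger_dist = footprint_h * (1 - forward_overlap)
    line_spacing = footprint_w * (1 - side_overlap)
    return altitude_m, trigger_dist, line_spacing

# Hypothetical camera: 13.2 mm sensor width, 8.8 mm focal length, 5472 x 3648 px,
# requested GSD of 2 cm per pixel
alt, d_trig, d_line = flight_plan_parameters(0.02, 8.8, 13.2, 5472, 3648)
print(f"Altitude {alt:.0f} m, trigger every {d_trig:.0f} m, line spacing {d_line:.0f} m")
```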

Figure 1.3-13: Setting autopilot’s user parameters for mapping.

Image credits: Jérôme Ammann LGO, CNRS – UBO.


Checklist when planning a flight . The checklist process starts in the office and continues in the field before the flight, then after the flight once back in the office. Here is a summary table of the relevant actions in chronological order.

Table 1.3-4: Checklist of steps to prepare, carry out and complete a UAV flight.

References for further reading


1.4 Legal considerations of UAV flights

Claudia Stöcker and Jaap Zevenbergen

1.4.1 Development of UAV regulations ......................................... 63
1.4.1.1 From past to present at a global scale ............................... 63
1.4.1.2 European efforts towards harmonised UAV regulations .................. 66

1.4.2 Content of UAV regulations ............................................. 68
1.4.2.1 Regulatory approaches ................................................ 68
1.4.2.2 Requirements towards UAV, UAV operator and UAV pilot ................. 69
1.4.2.3 Operational aspects .................................................. 70
1.4.2.4 Privacy and ethics-related aspects ................................... 72
1.4.3 Joint responsibility ................................................... 73

Increasing operational capabilities of UAVs as well as improved hard- and software components are raising severe concerns about public safety, privacy and data protection. Therefore, more and more national and international authorities introduced legal provisions that regulate the use of UAVs. Such regulations significantly impact how, where, when and by whom UAV-based data can be captured. This chapter is based on Stöcker et al., 2017 and Stöcker, 2021 and provides an overview of past and present developments as well as important aspects of current regulatory frameworks.

1.4.1 Development of UAV regulations

1.4.1.1 From past to present at a global scale

The history of UAV regulations dates back to manned aviation and the emergence of aeroplanes during World War II. In 1944, the international community established the first globally acknowledged aviation principles—the Chicago Convention. Besides the main focus on requirements for safe and secure flights in manned aviation, one article addresses pilotless aircraft and highlights the need for special authorisation of UAV operations.

Due to the early developments of UAVs in the form of manipulated model aircraft, UAV operations were usually conducted under the respective regulations for model aircraft. In the 2000s—after years of technological research and innovation—UAVs developed into a commercially workable system for a wide field of applications. Hence, in 2006, the International Civil Aviation Organization (ICAO) identified and declared the need for internationally harmonised terms and principles for the civil use of UAVs (ICAO, 2015). To strengthen the safe operation of UAVs throughout the world, ICAO published Circular 328 AN/190 in 2011 as a first step to provide a fundamental international regulatory framework through standards and recommended practices. In 2016, the same organisation published an online toolkit that delivers general guidance for regulators and operators. ICAO further issued recommendations on the safe integration of UAVs into controlled airspace. In those, UAVs are “(…) envisioned to be an equal partner in the civil aviation system [that is] able to interact with air traffic control and other aircraft on a real-time basis” (ICAO, 2015). As this manual mainly focuses on global harmonisation of UAVs in air traffic-controlled environments, lower priority is granted to visual line-of-sight (VLOS) operations (ICAO, 2015).

At a national level, the UK and Australia were the first nations that promulgated regulations in 2002. Some European countries, as well as the US, Canada, Brazil, and Russia, followed during the next years. As visualised in Figure 1.4-1, the vast majority of countries – particularly in Asia and Africa – remained without regulations during that time. Only after 2012, aided by guiding documents of the ICAO as well as a continually growing UAV market, the number of countries with enacted UAV rules and regulations increased significantly.


Figure 1.4-1: Worldwide overview of the first release of UAV regulations

(source: Stöcker, 2021).

As of October 2019, more than 50 % of all nations have documents containing specific instructions for the use of UAVs (Figure 1.4-2). Most of these documents refer to regulations which are enforced by law, whilst a few countries published only guidelines or public notices as the law-making process is still in progress. In 2019, six nations banned the use or even the import of UAVs in the country (Kenya, Egypt, Uzbekistan, Brunei Darussalam, Cuba and Morocco).

Figure 1.4-2: Status of national UAV regulations at a global level.

Internet sources of relevant international UAV organisations and crowd-sourced platforms provide useful links and precompiled overviews from which general information on the status of national UAV regulations can be derived; a shortlist of related resources is given in Table 1.4-1. Due to the rapid emergence of and ongoing changes to UAV regulations, none of the collections provides a reliable, complete and coherent picture of regulations. Before undertaking any UAV mission in a country, the information should therefore always be validated against official documents of the national aviation authority.

Table 1.4-1: Overview of online compiled lists handling UAV regulations.

Besides the official ‘hard’ law regulations, soft law is increasingly gaining importance in guiding the development of the UAV market as well. As an example, in 2019 the International Organization for Standardization published the first UAV-related standard (ISO, 2019), which specifies internationally agreed and accepted requirements for safe commercial UAV operations. This standard includes protocols on safety and security, data protection, the operator, the airspace, facility and equipment, requirements, operations, and maintenance, and will support shaping future UAV legislation.

1.4.1.2 European efforts towards harmonised UAV regulations

Besides national efforts to introduce UAV regulations, international organisations took initiatives in parallel. At the European level, the European Commission set up the European RPAS Steering Group (ERSG)—a gathering of organisations and experts in this field. A critical step towards the integration of civil UAVs into the European aviation system was made with the publication of the Riga Declaration on Remotely Piloted Aircraft in 2015. This declaration highlights five main principles that should guide the regulatory framework in Europe (EASA, 2015):


1. Drones need to be treated as new types of aircraft with proportionate rules based on the risk of each operation;

2. EU rules for the safe provision of drone services need to be developed now;

3. Technologies and standards need to be developed for the full integration of drones in the European airspace;

4. Public acceptance is key to the growth of drone service;

5. The operator of a drone is responsible for its use.

Regulation of UAVs below 150 kg was handled by all member states individually until August 2018, when, with Regulation (EU) 2018/1139, the European Commission received the mandate to regulate all sizes of UAVs (European Parliament, 2018). Following this mandate, EASA published the first common European UAV rules in summer 2019, which came into effect on January 1st 2021 and replace existing national provisions (European Commission, 2019). Ultimately, this regulatory reform harmonises the European UAV market and enables UAV pilots to accomplish UAV flights in the EU without struggling with heterogeneous national legislation. While aiming primarily at ensuring safe operations of UAVs, the European regulatory framework will also facilitate the enforcement of citizens’ privacy rights and contribute to addressing security issues and environmental concerns. The current approach is risk-based (1.4.2.1) and distinguishes three main categories applicable to commercial and recreational users alike: the low-risk “open category”, the “specific category”, and the high-risk “certified category”. Specifications are outlined in Figure 1.4-3. In this scheme, European and national aviation authorities share responsibilities for authorisation. The regulations are planned to be fully implemented by national aviation authorities by January 2023.

Figure 1.4-3: Categorisation of UAV operations according to Commission

Implementing Regulation 2019/947, based on EU, 2020.


1.4.2 Content of UAV regulations

Besides the great advantages of UAV applications, two main risks are associated with their operation: firstly, collision with other airspace users and, secondly, the impact of UAVs with the ground and objects on the ground. Generally speaking, UAV regulations exist to manage the associated risks and to minimise potential harm to people and property to an acceptable level. To that end, requirements towards operators, UAVs and pilots (1.4.2.2) as well as operational limitations for flying (1.4.2.3) are inherent parts of UAV regulations. With an intensifying societal debate about privacy and the potential misuse of UAV technology to seriously violate privacy, an increasing number of UAV regulations also include provisions about privacy, ethics and data protection (1.4.2.4).

1.4.2.1 Regulatory approaches

Different regulatory approaches can be taken to maintain air safety and public safety. In the early days, and to some extent even at present, UAV regulations were based on case-by-case authorisation in which civil aviation authorities treat every application for a flight mission as a stand-alone exercise. However, this approach is very time-consuming and mostly fails to provide the regulatory certainty deemed necessary to accelerate the UAV industry. In the United States, for example, Rango & Laliberte (2010) concluded that the regulations of the Federal Aviation Administration were restricting progress in UAV-based natural resources mapping.

Pushed by the quick development of numerous UAV applications and increasing demand for more efficient and practical regulatory procedures, most countries nowadays follow a risk-based approach. Regulations set out proportionate requirements following the rationale of the level of operational risk posed by a UAV (Washington et al., 2019). Most aviation authorities distinguish risk categories based on the weight, the site, or the operational complexity. As a rule of thumb, the higher the risk category, the stricter the operational conditions under which a flight authorisation is granted.

Figure 1.4-4 shows the example of the Canadian regulatory approach illustrating the different risk categories at hand. All UAVs under 250 g are exempted from the regulations. In the weight category of 1 kg to 25 kg, sub-categories are distinguished according to the site of the UAV op-eration. Remote areas are categorised at a lower danger level than built-up areas as well as areas close to aerodromes. Independent of the weight, Beyond Visual Line of Sight operations are all rated with a high-risk category which adds operational complexity as the third dimension to the risk assessment.


Figure 1.4-4: Canadian UAV risk categorisation as an example of a risk-based regulatory approach. Based on a graphic representation by Spectral Aviation Canada 2020 (http://spectralaviation.com/en/summary-new-drone-regulation/).

1.4.2.2 Requirements towards UAV, UAV operator and UAV pilot

Depending on the associated risk of the flight operation, requirements towards the UAV, the UAV operator and the UAV pilot are part of all UAV regulations. Predominantly for commercial purposes, but increasingly also for recreational uses, most regulators call for formal registration as well as identification marks on the UAV. Even though compliance with a minimum level of technical airworthiness is deemed necessary in most regulatory frameworks, special airworthiness certificates as mandatory for manned aircraft are less relevant for small UAVs up to 25 kg, except when operated under special conditions (cf. Federal Aviation Administration, 2016). Besides general airworthiness, most legislative contexts require automated procedures to terminate the flight in case of system failure. Such emergency cases can be caused by, e.g., the breakdown of components essential for a safe flight (motor loss or damage to wings), loss of the data link, or insufficient battery to complete the mission. Measures to deal with such situations are called fail-safe systems or flight termination systems and must be able to allow for human-independent system guidance and ensure that all safety objectives are achieved.

Particularly for UAV flights that are not for recreational purposes, UAV regulators often anticipate specific approval of the UAV operator, which is seen as the superior unit (such as a company or a university) managing the UAV flights. The preparation of an operational manual is vital for such organisations to become a UAV operator and is seen as a kind of contract between the national civil aviation authority and the organisation. The main content includes risk and site assessments, emergency procedures, technical details and operational limitations of all UAVs, as well as documentation of UAV pilot qualifications. In most cases, proof of sufficient third-party liability insurance is mandatory for UAV operators. Besides the UAV and the UAV operator itself, many regulations include demands upon the UAV pilot. Here, practical training, theoretical knowledge tests, aeronautical tests, and medical assessments encompass the most common requirements (Stöcker et al., 2017). In most countries, the level of required pilot skills tends to depend on the risk category of the flight mission.

1.4.2.3 Operational aspects

Operational limitations refer to elements that pose certain restrictions on UAV flight missions. Most prominent are so-called no-fly zones, which must not be entered by a UAV. Typically, no-fly zones are defined around airports and airstrips, nature protection areas, representative buildings, or congested areas. In addition to permanently restricted areas, emergency situations such as police or fire brigade operations might also trigger temporary UAV flight restrictions in other areas.

To ensure clear segregation between manned and unmanned aircraft, UAVs are usually al-lowed to operate only in uncontrolled airspace (airspace class G) which is not managed by air traffic control (ATC). In contrast, controlled airspace around airports (airspace class B, C, D) and at specific altitudes (airspace class A, E) is subject to ATC service provisions and designated for manned aircraft. Thus, controlled airspace is commonly considered as a no-fly zone, and the accomplishment of commercial or recreational UAV flights would require special authorisation or a waiver (see Figure 1.4-5).


Figure 1.4-5: Airspace guidance for UAV operators in the United States

Image credits: Federal Aviation Administration, 2018.

Another aspect that can be found in almost all UAV regulations worldwide refers to the maximum flight height. Apart from a few exemptions, an altitude of 120 m (400 ft) above ground level (AGL) is considered as the upper bound of permitted UAV flight height. This homogeneous criterion is closely linked to the minimum safe altitude for manned aircraft, which is typically 150 m (500 ft) AGL in non-congested areas and 300 m (1,000 ft) above field elevation in congested areas. Besides the precise definition of a maximum flight height, the horizontal distance between the UAV pilot and the UAV is also a typical yet differently defined aspect in national UAV regulations (Stöcker et al., 2017). As shown in Figure 1.4-6, three ranges can be distinguished: visual line of sight (VLOS), extended visual line of sight (EVLOS) and beyond visual line of sight (BVLOS). In VLOS conditions, the pilot must be able to maintain direct unaided visual contact with the UAV. If not amended by a specific distance, the required VLOS range can be subject to various interpretations. EVLOS involves an additional person as an observer during the UAV mission and extends the horizontal distance to the distance over which the external observer can keep visual contact with the UAV. The observer communicates critical flight information and supports the pilot in maintaining a safe distance from other airspace users. BVLOS refers to the range outside VLOS but still within the radio line of sight, which is required to keep full (manual) control over the UAV. Further operational limitations can include temporal aspects (day/night operation) or distances to people, vessels and infrastructure, amongst others.


Figure 1.4-6: Schematic distinction between UAV flight ranges: visual line of sight (VLOS), extended visual line of sight (EVLOS) and beyond visual line of sight (BVLOS).

Image from Stöcker et al., 2017 (license: Creative Commons Attribution http://creativecommons.org/licenses/by/4.0/).

1.4.2.4 Privacy and ethics-related aspects

The aspect of privacy and data protection in relation to the increasing use of UAVs is one currently widely discussed topic (Marzocchi, 2015; Nelson et al., 2019). As shown in chapter 1.3, UAVs can be equipped with multiple payloads such as imaging equipment or transmitters which can easily capture and record data of people, houses or other objects and thus potentially violate the privacy and data protection rights of a citizen. In a survey, Finn & Wright (2016) identified private users and law enforcement in particular as the categories of drone operators that pose a high risk to privacy, data protection and ethics. In 2017, the inclusion of privacy and ethics-related aspects in national UAV regulations remained very low (Stöcker et al., 2017). However, during the past two years (2019 and 2020), this issue gained importance, and an increasing number of UAV regulations refer to existing national and international data protection and related privacy regulations such as the General Data Protection Regulation in Europe.


The key challenge appears to be finding an optimum balance between the demands of the various actors: allowing for innovation on the one hand, but at the same time ensuring recognition and support for safety, fundamental human rights and civil liberties. The future development of civil UAV use will ultimately involve multiple interest groups and various motivations (Rao et al., 2016). Government institutions and regulatory bodies holding political mandates want to ensure public safety, security and civil liberties, but also to promote UAV innovation and technology development more generally. Stakeholders in research strive for UAV technical advancement. Hardware and software manufacturers aim to sell products and are interested in lowering market barriers and opening up new application areas. End users have their own needs and market interests according to their priorities.

It can be predicted that, over the next decade, technology, societal acceptance and regulation will converge. As showcased in this chapter, remarkable progress has already been made as more and more countries are establishing risk-based regulations as a fundamental basis to unlock the full potential of UAVs for their economies.

The bottom line is that all users should comply with the rules and regulations, even though compliance assessment and compliance finding might be in the early stages of development (Washington et al., 2019). Otherwise, if widely publicised incidents happen, the risk-based system might get jeopardised, and the current balance for regulating UAV missions might be revisited and even lead to a ‘no, unless …’ system in many more cases.

References for further reading


1.5 Guidelines for flight operations

Anette Eltner and Mike R. James

1.5.1 Flight settings
1.5.1.1 Pre-flight planning
1.5.1.2 Flight plan
1.5.1.3 Georeferencing

1.5.2 Camera settings
1.5.2.1 Sensor size and image format
1.5.2.2 Lens and focus
1.5.2.3 Exposure
1.5.2.4 Geometric camera calibration

Every UAV mission should be planned and prepared carefully, to ensure safe operations and to maximise the likelihood of successful data acquisition. Project aims and flight goals should be defined first, from which suitable equipment can be identified and the necessary information sourced for mission planning. Short missions may benefit from the portability and manoeuvrability of small, battery-powered multi-copter-style UAVs, whilst larger mapping missions may require the extended range of long-endurance fixed wing systems. Flight permissions, safety and pre-flight checklists, as well as data acquisition and processing protocols, are essential prerequisites for effective UAV missions, and the importance of a suitably trained and experienced pilot and observer should not be underestimated.


1.5.1 Flight settings

1.5.1.1 Pre-flight planning

The pilot must be familiar with the region in which the UAV will be used. If they are not already experienced in the area through previous work, then maps, satellite and aerial images can be studied to locate suitable areas for launch and landing, and to identify hazards such as power lines. Any no-fly zones near the area need to be located and plans developed to ensure they are avoided. For georeferencing data, good ground control point (GCP) locations can be estimated whilst in the office to ensure optimal coverage of the area of interest.

The expected environmental conditions for each flight should be assessed beforehand. For instance, elevated atmospheric salt concentrations might be considered for flights over a littoral zone, forests can make maintaining visual line of sight challenging, and sand can damage engines and rotors in desert environments. Weather is an important consideration and suitable mission dates should be identified from forecasts of appropriate flight conditions, i.e. low wind speed, no rain and, ideally, overcast cloud conditions for UAV photogrammetry. Of course, the prevailing weather conditions should be continuously reviewed up to and during a flight. GNSS signal availability should also be assessed (including possible interference from geomagnetic activity) because this can influence the reliability of aircraft control. Multiple apps are available to determine the suitability of weather and GNSS conditions for UAV flights (e.g. UAV Forecast).

Prior to each flight, all legal obligations must be met, keeping in mind that national and international regulations can be updated frequently (chapter 1.4). Where required, flight permissions must be acquired and fully documented. The use of safety and pre-flight checklists, which include personnel information as well as mission details, UAV settings and safety issues, is highly recommended. James et al. (2020a) provide examples (see their supplementary material for a ‘UAV Project Aviation Safety Plan and Signatures’ form from the USGS, and a pre-flight checklist) which can be adapted to individual requirements. Furthermore, detailed documentation of the data acquisition and subsequent processing should form part of any scientific UAV mission and an example protocol is provided by Eltner et al. (2016). Immediately prior to each flight, and continuously during it, the vicinity should be checked for the presence of people and other aircraft. Again, apps are available to monitor flight activity and flight zones (e.g. UAV Forecast, Airmap).

1.5.1.2 Flight plan

An appropriate flight plan will optimise data acquisition under safety, environmental and equipment constraints (e.g. maximum flight duration). These constraints will vary from site to site and from mission to mission, so here we’ll consider only the initial data acquisition aspects. In most scenarios, UAV data acquisition should be carried out in autopilot mode to facilitate controlled data capture. Data acquisition should be planned to generate suitable overlap (along and across flight lines) to satisfy data processing requirements and to avoid data gaps. The recommended overlap will vary depending on the sensor used; for example, UAV laser scanning (ULS, chapter 2.6) has different requirements to those for UAV photogrammetry (chapter 2.2) or UAV multispectral sensing (chapter 2.5).

UAV photogrammetry is widely used in environmental sciences, so we provide some specific flight plan recommendations for acquiring suitable imagery. The science objectives guide survey design by determining the smallest features needing to be detectable in imagery, or the spatial accuracy needed within topographic products. For instance, mapping soil surface roughness may require sub-centimetre resolution, whereas quantifying large magnitude river bank erosion may require only decimetre spatial resolution and accuracy. These requirements allow the survey’s ground sampling distance (GSD) to be estimated, but note that the GSD influences, but does not define, the survey accuracy (which is also a function of the photogrammetric image network and georeferencing, chapter 2.3). Horizontal survey accuracy can often achieve a value smaller than the GSD, but vertical accuracy is usually substantially poorer. However, carefully acquired, high-quality image networks that are well supported by GCPs can achieve vertical errors smaller than the GSD.

Considering GSD as the size of an image pixel on the ground relates it to the selection of camera (e.g. focal length and pixel pitch) and flying height:

GSD = (pixel pitch / focal length) × flying height

In many practical cases, there may be a limited or no choice in UAV or camera, and flight height may be the only variable. For example, if a GSD of 2 cm is required and the available camera has a focal length of 5 mm and a pixel pitch of 1.7 µm, then a flying height of about 60 m is needed.
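As a rough illustration, this relationship can be expressed in a few lines of Python; the function names and the example values (the 2 cm GSD scenario described above) are purely illustrative:

def gsd(pixel_pitch_m, focal_length_m, flying_height_m):
    """Ground sampling distance [m/pixel] for a nadir image."""
    return pixel_pitch_m / focal_length_m * flying_height_m

def flying_height_for_gsd(target_gsd_m, pixel_pitch_m, focal_length_m):
    """Flying height [m AGL] needed to reach a target GSD."""
    return target_gsd_m * focal_length_m / pixel_pitch_m

# Example from the text: 2 cm GSD with a 5 mm focal length and 1.7 µm pixel pitch
h = flying_height_for_gsd(0.02, 1.7e-6, 5e-3)
print(f"required flying height: {h:.0f} m")  # about 60 m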

The camera and flight height define the size of the image footprint on the ground, from which the distance between image acquisitions (i.e. the base, Figure 1.5-1) is determined to ensure overlap between adjacent images. Along a flight line, image overlap should be a minimum of 60 % and, between adjacent parallel flight lines, overlap should be a minimum of 20 %. However, significantly higher overlaps along (80 %) and between (60 %) flight lines are usually recommended to increase the number of matched image points for photogrammetric processing. Higher overlaps enable the same object point to be observed in more images with similar views, which usually increases image matching success. This can be especially important in otherwise difficult-to-match images, such as those from areas of extensive vegetation cover.

Figure 1.5-1: (a) Influence of flying height, pixel pitch and focal length on ground sampling distance (GSD) and image overlap. For a constant base (the distance between camera projection centres), image overlap decreases with increasing focal length (a to b) and increases with increasing flying height (b to c). All images were prepared by the authors for this chapter.
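The overlap recommendations above can be turned into the camera base and the flight line spacing once the image footprint is known. The following sketch assumes a simple nadir geometry; the sensor dimensions and overlap values are example assumptions, not recommendations:

def footprint(flying_height_m, focal_length_m, sensor_width_m, sensor_height_m):
    """Ground footprint (width, height) [m] of a nadir image."""
    scale = flying_height_m / focal_length_m
    return sensor_width_m * scale, sensor_height_m * scale

def base_and_line_spacing(footprint_along_m, footprint_across_m,
                          overlap_along=0.8, overlap_across=0.6):
    """Distance between exposures (base) and between flight lines [m]."""
    return footprint_along_m * (1.0 - overlap_along), footprint_across_m * (1.0 - overlap_across)

# Example: small-format camera (6.2 mm x 4.6 mm sensor, 5 mm lens) flown at 60 m AGL
w, h = footprint(60.0, 5e-3, 6.2e-3, 4.6e-3)
print(base_and_line_spacing(h, w))  # base along track, spacing across track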

The most accurate topographic models are achieved from ‘strong’ photogrammetric image networks in which the position and orientation of each captured image can be estimated reliably, as well as the camera calibration parameters. Conventional aerial surveys (e.g. Figure 1.5-2a) tend to result in a relatively weak image network, but networks can be strengthened by capturing each feature in more images from a wider range of different camera orientations and by including different viewing distances within the survey. Consequently, a flight plan will usually represent a trade-off between spatial coverage and the strength of the resulting image network.


In general, flight planning should consider the most appropriate camera calibration strategy, taking into account the accuracy demands and flight path options, which will be different for fixed wing and rotor-based UAVs. If the UAV has sufficient flight endurance, additional cross-strip flight lines should be carried out (Figure 1.5-2b). These increase the strength of the image network geometry, and they can be particularly useful in directly georeferenced surveys (where GCPs are not used; e.g. Gerke and Przybilla, 2016). A strong image network geometry is widely recommended because it facilitates the most accurate results by enabling a high-quality ‘on-the-job’ camera calibration (chapter 1.5.2.4). Such calibrations are an effective approach because they are based on the survey images themselves (so calibrations are optimised for the specific surveys) and they don’t require additional flights or image sets solely for calibration. The flight plan can strengthen the image network geometry further by including multiple flying heights (especially over flat terrain or for images captured from high altitudes above the surface) and convergent images where possible, to decrease the likelihood of systematic errors in 3D surface models (chapter 2.3).

Figure 1.5-2: Flight planning for UAV photogrammetry, considering (a) overlap along the flight direction and across the flight strips, (b) improving the stability of the image network by performing cross-strip flights, (c) decreasing the potential of systematic errors (such as domes) by capturing convergent images (red) in addition to nadir images (green), (d) improving focal length estimation by flying at different altitudes.


1.5.1.3 Georeferencing

In most scenarios, survey georeferencing is required to enable scaled measurements and multi-temporal analysis (chapter  2.1). Some sensors require direct georeferencing, which is based on knowledge of the sensor position and orientation, i.e. ULS (chapter 2.6). Other sensors, such as RGB and TIR cameras, can also be used with indirect georeferencing, which relies on GCPs. For some applications, the use of a locally defined coordinate system can be sufficient (for instance, defined by stable targets for which the inter-target distances are known).

The choice of type and distribution of GCPs is important, particularly for photogrammetry. The best results are usually achieved by using artificial (i.e. manufactured) targets as GCPs rather than natural features in the scene. In this case, GCPs should be sized such that they have widths of between ~5 and 20 pixels in the images, to provide good visibility and to enable precise measurement of their centre. Therefore, GCP size should be considered once the GSD has been determined for the flight; for example, with a 2 cm GSD, targets of roughly 10–40 cm width would fall within this range. Different sensors (e.g. multi- and hyperspectral sensors, chapter 2.5) will require different materials to ensure a strong GCP contrast against the image background within measured wavelength bands. For instance, for a thermal sensor, material should be selected based on its radiance in the thermal spectral range (chapter 2.4). One option is using black velvet, which is strongly absorbing, in front of a strongly reflecting heat foil (Westfeld et al., 2015). Natural features can also be used as GCPs, but they do not usually achieve the same accuracies as artificial targets due to their lower contrast and distinctiveness in the images.

The use of consumer-grade sensors and platforms introduces substantially more variability into UAV photogrammetric image networks than in conventional (survey-grade) aerial photogrammetry. This limits the application of the direct relationships between GCPs and expected survey accuracy that have been developed for conventional aerial photogrammetry (e.g. Kraus, 2007). Consequently, generalised recommendations cannot be provided to determine the number, density or quality of GCPs required to achieve a specific UAV survey accuracy. Survey accuracy will depend on the location, flight pattern, environmental conditions and the sensor.

In the ideal scenario, GCPs should be distributed equally across the full survey area to provide valuable constraints for the shape estimated by the photogrammetry, as well as for georeferencing. However, this is rarely implemented because GCP deployment is typically limited by practical considerations (e.g. difficulties accessing the entire survey area and limited field time). Minimal GCP deployments can be based on conventional aerial photogrammetry guidelines (e.g. Kraus, 2007), with GCPs in each corner of the survey area and also along the area edges. Additional GCPs within the survey area are likely to strongly improve the height accuracy of any 3D model. If increasingly complex flight plans are used to strengthen the image network geometry (which generally improves the shape accuracy of topographic models, Figure 1.5-2c, d), the number of GCPs can usually be decreased. Nevertheless, wherever possible, more GCPs should be deployed than are needed, so that some can be used as independent check points (CP). CPs are not included within the photogrammetric processing, but are essential for providing unbiased estimates of the photogrammetric accuracy.

1.5.2 Camera settings

Sensor settings (hardware and software) should always be checked for optimised data capture. This can be particularly important when images are being acquired for photogrammetry, where areas of poor data quality can impact the overall accuracy of results (O’Connor et al., 2017; Mosbrucker et al., 2017).

1.5.2.1 Sensor size and image format

If the UAV allows the use of different cameras, then camera selection often primarily considers camera type, weight and cost. Sensor size within the camera should also be considered due to its influence on image quality. Larger sensors, with larger pixel sizes, should be preferred because they generally have less image noise (larger pixels can collect more photons, to give a better signal to noise ratio).

For many applications, recording image data in compressed file formats (such as JPG) will be sufficient. Nevertheless, if possible, capturing images in a larger RAW format should be considered, even if their subsequent processing is more involved. RAW imagery preserves all the originally captured information and allows enhancements, such as exposure correction, where necessary. Thus, particularly for field campaigns with challenging light conditions, RAW format is beneficial. Furthermore, use of RAW format makes it possible to avoid the in-camera distortion corrections that many systems automatically implement in their compressed image output, but which may have implications for photogrammetric processing (James et al., 2020b).


1.5.2.2 Lens and focus

If the UAV camera allows different lenses to be selected, then the optimum focal length can be considered. For a given sensor size, a shorter focal length gives a wider field of view, so fewer images are required to maintain the same overlap for any particular flying height. The resulting larger distances between image acquisitions lead to observations from wider angles, which can improve the height accuracy of photogrammetric products. However, this advantage is offset by accuracy reductions due to the increased GSD, and image matching tends to be less successful because the appearance of objects changes more strongly between images from wider viewpoints. Nevertheless, fewer images will have to be captured to cover the survey area (Figure 1.5-3).

Wider-angle lenses usually display increasingly complex lens distortions that may be decreasingly well represented by the distortion model used in most image-based 3D reconstruction software. In most scenarios, greater distortion will just underscore the need for careful consideration of camera calibration for accurate measurements (chapter 1.5.2.4). However, in extreme cases such as fisheye lenses, alternative distortion and projection models may be required.

Figure 1.5-3: Influence of the focal length to flying height ratio and the base on ray intersection angles. For a similar image overlap from a given flying height, a smaller focal length (a) enables a larger base compared to a larger focal length (b). However, this leads to smaller angles between intersecting rays (b), which weakens height measurement accuracy with respect to the GSD, when compared to larger bases.

The camera focus should be set to manual to avoid focus varying during the mission, which would alter the camera’s interior geometry. This is important because photogrammetric processing usually assumes that the camera model, which represents the camera’s interior geometry, is the same (constant) for all images. In most data capture scenarios, the focus should be set to infinity, except for very close-range flights, or if very large focal lengths are used. Lenses with a fixed focal length are preferred over zoom lenses because they generally offer better geometric stability (Eltner & Schneider, 2015). If the geometric stability of the camera and lens is of concern, then photogrammetric processing can be carried out using independent camera models for each image, but this does run the risk of over-parameterization, and results are usually improved if a fixed focal length lens can be used.

1.5.2.3 Exposure

In most scenarios, the camera ISO setting should be as low as possible to minimise image noise, but using the auto ISO option is usually sufficient. Only under low light conditions might higher ISO values be suitable. The shutter speed should also be as fast as possible to avoid motion blur whilst maintaining an acceptable image exposure. The aperture should be set to a high aperture number (i.e. a small aperture) to provide sufficient depth of field whilst, again, being aware of the risk of underexposure if the aperture is too small. Finally, and particularly for photogrammetric work, cameras with a global shutter (which exposes the whole image sensor at once) should be preferred over cameras with a rolling shutter (which exposes the sensor pixels sequentially, row by row). Rolling shutters can cause image distortions that can be especially problematic on fast-moving platforms.

1.5.2.4 Geometric camera calibration

The geometric camera calibration strategy should be evaluated prior to the UAV flight, with options considered for either independent pre- and/or post-flight calibration, or ‘on-the-job’ self-calibration (Gruen & Beyer, 2001; chapter 2.2). In general, on-the-job calibration is preferred, but this requires strong image network geometries and, usually, a good distribution of accurately measured GCPs, for reliable results. For a strong image network geometry, the UAV flight pattern should include nadir and inclined images at different flight heights, and cross-strips for images rotated around the Z-axis (Hastedt & Luhmann, 2015). If a gimballed camera system is used and the survey area is not too extensive, point of interest (POI) flight paths can be adapted to provide spatial coverage with a wide range of oblique imagery, resulting in high-quality photogrammetric products due to the image network strength (Sanz-Ablanedo et al., 2020).

However, if flight pattern options are more limited, and the image network geometry is likely to be weak (e.g. a single image strip, or multiple parallel strips at the same altitude, with only few GCPs over low-relief topography), it can be beneficial to perform additional camera calibration to avoid systematic errors in the reconstructed 3D model (Hastedt & Luhmann, 2015; Harwin et al., 2015; Eltner & Schneider, 2015). One such scenario might be a directly georeferenced survey with no GCPs (chapter 2.1), for which pre- and/or post-flight camera calibration could be performed using a test object of known 3D coordinates (a test-range calibration; Fraser, 2001). A more flexible option is to use a temporary calibration field (Figure 1.5-4), which is imaged from different distances, angles and with camera rotations, to form a very strong geometry for a self-calibrating image network (Luhmann et al., 2019).

Figure 1.5-4: Example image capture arrangement for pre- or post-flight camera calibration with a temporary calibration field (for more details see Luhmann et al., 2019).

References for further reading


2 Data Acquisition


2.1 Georeferencing UAV measurements

Lasse Klingbeil

2.1.1 Coordinates
2.1.1.1 Coordinate systems
2.1.1.2 Global reference frames
2.1.1.3 Geodetic coordinates
2.1.1.4 Map projections and local coordinate systems
2.1.1.5 Heights
2.1.1.6 Examples

2.1.2 Sensors for georeferencing
2.1.2.1 Terrestrial geodetic measurement equipment
2.1.2.2 Global Navigation Satellite Systems (GNSS)
2.1.2.3 Sensors for rotational information
2.1.2.4 Trajectory estimation

2.1.3 Georeferencing concepts
2.1.3.1 Direct georeferencing
2.1.3.2 Indirect georeferencing
2.1.3.3 Integrated approaches
2.1.3.4 Available systems

In order to provide a spatial context for the information gathered with the UAV, the measurements taken with the airborne sensor or the products derived from them have to be georeferenced in some way. This can be realized by determining the position and the rotation of the sensor directly, by knowledge of the location of specific points in the observed environment, or by a mixture of both. This chapter introduces a variety of coordinate systems that are used in practice and describes different methods to assign these coordinates to the data.

A major goal in UAV based environmental monitoring is to collect spatially distributed sensor data and to derive useful information from those data. In most cases, this information consists of geometrical representations of the environment, such as 2D maps (e.g. orthomosaics), 2.5D maps (e.g. digital surface models) or 3D maps (3D models or point clouds). Please note that within this chapter all of the above examples are called maps. Often these maps are augmented with spectral information, such as colour, temperature or spectral reflectance. Although there are many different sensors and processing methods available to generate these maps, a common requirement is usually that the maps are provided in a well-defined coordinate system. This is necessary for the integration of different data sources and models, as well as for the analysis of multi-temporal data sets.

Although any coordinate system with a given definition of the origin, the axes, and the coordinate type may be suitable for this, it has some advantages to use global geodetic coordinate systems, such as the ITRS (International Terrestrial Reference System). Because many different coordinate systems are usually involved in a measurement campaign (sensor coordinate systems, national mapping coordinate systems, Global Navigation Satellite System (GNSS) coordinates), it is useful to get an overview of their concepts and relationships.

The process of assigning geodetic coordinates to the data of interest (in our case the resulting map) is called georeferencing. There are two general concepts of georeferencing:

Indirect Georeferencing. Here, objects or features with known geodetic coordinates are integrated into the map generation procedure or are used to transform a local map into a global coordinate system in a post-processing step. An example of indirect georeferencing is the classical aero-triangulation with clearly visible dedicated targets at known positions (Ground Control Points), as it is realized in most software packages for UAV data processing.

Direct Georeferencing. Here, the global position and orientation parameters of the mapping sensor (e.g., a camera or a LIDAR) at the time instances of its measurements are determined and used during the map generation process. This method is usually used in airborne laser scanning, where the position and orientation of the laser scanner are determined using an advanced multi-sensor setup (GNSS/IMU unit, see below) in order to transform all laser measurements into the global coordinate system.

Sometimes, the two concepts are combined in integrated approaches. An example is GNSS-aided aero-triangulation, where GNSS coordinates of the UAV at each image location are recorded and integrated into the map generation process together with a set of ground control points.

The particular realization of these georeferencing concepts depends strongly on the mapping sensor used, the other available sensors, and the application. In this chapter, we will give an overview of the concepts and of the aspects that are common to all of them, without going too much into the details of certain mapping sensors. We will focus on the sensors which are used to determine the position and orientation of the UAV, as they are used in direct and integrated approaches, but also on sensors that are often used in indirect approaches. Details on the map generation process, which also include georeferencing aspects, can be found in the chapters focussing on Structure from Motion (chapter 2.2) or airborne laser scanning (chapter 2.6).


Another term, which is often used in the context of georeferencing, is ‘registration’. Registration means in most cases the transformation of a set of spatial data (e.g. a point cloud as a result of a laser scanner) from one coordinate system into another. If a point cloud taken with a terrestrial laser scanner from a certain position is ‘registered’ to a point cloud taken from another position, then this usually means

• the determination of the transformation parameters between the two point clouds using methods such as ICP (iterative closest point) or using distinct 3D points, such as targets or feature points, and

• the transformation of one of the point clouds using these estimated parameters.

If the target coordinate system of the registration process is a global geographic system, then this can be seen as a method for indirect georeferencing as described above. Even in the case where the transformation parameters are determined using sensor data, as in the case of direct georeferencing, the term registration is sometimes used. In the rest of the chapter we will not use the term registration any more, but the reader should be aware of its relation to georeferencing, especially in the context of laser scanning.

The chapter is organized as follows. Firstly, different coordinate systems are described, which are usually involved in the mapping process with UAV systems. This also includes a very quick overview of global reference frames, heights, and map projections. Secondly, different sensors are described that are usually used within the georeferencing process. A major focus is on giving an overview of the principles of GNSS receivers, but inertial sensors as well as methods to calculate the position and rotation of a UAV are also briefly reviewed.

2.1.1 Coordinates

As the purpose of georeferencing is the determination of the coordinates of sensor data or a derived product in some sort of global or at least common coordinate system, it is useful to understand the variety of coordinate systems and frames and the underlying concepts. This section aims to provide a very brief overview of this equally important and confusing topic.

2.1.1.1 Coordinate systems

Coordinate systems are defined by their origin and their coordinate axes. In the context of UAVs, the involved coordinate systems are:


Earth Centred Earth Fixed: Global system attached to the Earth. The origin is at the mass centre of the Earth, the z-axis is parallel to the Earth rotation axis, the x-axis passes through the intersection of the Greenwich meridian (0° longitude) and the equatorial plane, and the y-axis completes a right-handed coordinate system. Coordinates in this system, as for example the position of a UAV, are usually written as p^e.

Body frame: Local system attached to the UAV. The origin is some point on the vehicle, e.g. the centre of mass, the x-axis is pointing forward, the z-axis is pointing down and the y-axis is completing the right-handed system (pointing to the right). Please note that ground vehicles often have a different definition with z pointing up and y pointing to the left, and that this is also often applied to UAVs. Coordinates in this system, as for example the position of a sensor on the platform, are usually written as l^b.

Sensor frame: Local system attached to the sensor. Raw sensor readings are given in this system. Its definition strongly depends on the type of sensor. Coordinates in this system, as for example the position of an object detected by a scanner, are usually written as x^s.

Navigation frame: Local topocentric system. The origin is the same as for the body frame, the x-axis is pointing towards North, the z-axis is pointing down (parallel to gravity), and the y-axis is completing the right-handed system, pointing East. This system is also called NED (North-East-Down). The rotation information about the UAV is usually given as a rotation R^n_b between the body frame and the navigation frame. If the UAV is levelled and the x-axis is pointing North, then the three rotation angles (roll, pitch, yaw) are (0, 0, 0). Please note that if the body frame is defined in a (Forward-Left-Up) mode, as is usual for ground vehicles, then the navigation frame is usually defined as (East-North-Up).

Figure 2.1-1: Coordinate frames involved in the georeferencing of sensor data taken with a moving platform (e.g. UAV). Unless otherwise stated, all images were prepared by the author for this chapter.
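To make the relation between the frames concrete, the sketch below rotates a point given in the body frame (e.g. a sensor lever arm l^b) into the local navigation (NED) frame using roll, pitch and yaw; the angle values and the lever arm are arbitrary example numbers:

import numpy as np

def rot_n_b(roll, pitch, yaw):
    """Rotation matrix R^n_b (body -> NED) from roll, pitch, yaw [rad], yaw-pitch-roll order."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    return rz @ ry @ rx

lever_arm_b = np.array([0.10, 0.00, 0.05])  # sensor offset in the body frame [m]
R = rot_n_b(np.radians(2.0), np.radians(-3.0), np.radians(45.0))
print(R @ lever_arm_b)                      # the same offset expressed in the NED frame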


2.1.1.2 Global reference frames

The above definition of the Earth-Centred Earth-Fixed coordinate system is a theoretical concept and the question arises how these coordinates can actually be measured, as the centre of mass of the Earth is not physically accessible and neither are the axes. Apart from that, these parameters change over time due to continental drift and other geophysical effects. To account for this issue, the IAG (International Association of Geodesy) maintains the ITRF (International Terrestrial Reference Frame), which realizes a global coordinate system by defining up to 1000 reference coordinates all over the planet and updating their values every few years based on space geodesy methods. These methods (e.g. Very Long Baseline Interferometry, Satellite Laser Ranging, and GNSS) use astronomical or celestial objects (satellites, quasars) as tie points to determine the coordinates of points on the Earth surface. Due to the regular updates, the values of a set of global coordinates need to be given with a frame description, which includes a year number. The current version is ITRF2014. However, the changes between the different versions are in the order of a few millimetres up to a centimetre and are therefore not relevant for most UAV applications.

Another important global reference frame is the WGS84 (World Geodetic System). It is the official reference frame of the Global Positioning System (GPS) and coordinates determined with GPS receivers usually use this convention. Although the system is maintained by the National Imagery and Mapping Agency (NIMA) of the United States, it is nowadays very similar to the current version of the ITRF and the coordinate values can be treated as identical. Please note that although WGS84 has the year number 84 in its name, it is still constantly updated without changing this number. This is done for maximum confusion.

2.1.1.3 Geodetic coordinates

While the ITRF is represented in Cartesian coordinates (x, y, z), global coordinates can also be given in geodetic coordinates, which are longitude, latitude and height (Figure 2.1-1). Here, the Earth figure is represented as an ellipsoid, and a point on the Earth is described by two angles, describing the east-west position (longitude, starting from the Greenwich meridian) and the angle between the local normal and the equatorial plane (latitude). The third coordinate component is the height above the ellipsoid. It is important to understand that the values of these coordinates depend on the definition of the ellipsoid parameters (e.g. the lengths of the two half-axes), and that these have to be known when using geodetic coordinates. In the example of GPS coordinates, the ellipsoid used is the GRS80 ellipsoid. Ellipsoid parameters are usually defined within a reference frame.
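As a simple sketch of how geodetic coordinates relate to Cartesian ones, they can be converted to Earth-centred Earth-fixed (ECEF) coordinates using the GRS80 ellipsoid parameters; the test point below is an arbitrary example:

import numpy as np

A = 6378137.0             # GRS80 semi-major axis [m]
F = 1 / 298.257222101     # GRS80 flattening
E2 = F * (2 - F)          # first eccentricity squared

def geodetic_to_ecef(lat_deg, lon_deg, h):
    """Latitude/longitude [deg] and ellipsoidal height [m] to ECEF x, y, z [m]."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    n = A / np.sqrt(1 - E2 * np.sin(lat) ** 2)   # prime vertical radius of curvature
    x = (n + h) * np.cos(lat) * np.cos(lon)
    y = (n + h) * np.cos(lat) * np.sin(lon)
    z = (n * (1 - E2) + h) * np.sin(lat)
    return x, y, z

print(geodetic_to_ecef(50.7, 7.1, 160.0))  # arbitrary example point, ~160 m ellipsoidal height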


2.1.1.4 Map projections and local coordinate systems

A very common type of coordinates is UTM coordinates (Universal Transverse Mercator). UTM coordinates are the result of a so-called map projection, where the curved surface of the ellipsoid is projected onto a plane, in order to create a 3-dimensional metric coordinate system. The x-y (East-North) plane represents the ellipsoid surface and the z-axis codes the orthogonal deviation from that surface. Each longitudinal strip of the Earth figure (6° width) creates a new projection plane, therefore the strip number needs to be part of the coordinate values (see example below). Because the UTM projection only describes the projection method, the reference frame and the ellipsoid parameters still need to be specified.
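In practice such projections are usually handled by a projection library. The snippet below uses pyproj (one common choice, shown only as an illustration) to project geographic WGS84 coordinates into UTM zone 32N; the input point is an arbitrary example:

from pyproj import Transformer

# WGS84 geographic (EPSG:4326) to WGS84 / UTM zone 32N (EPSG:32632)
to_utm = Transformer.from_crs("EPSG:4326", "EPSG:32632", always_xy=True)
east, north = to_utm.transform(7.10, 50.70)   # lon, lat in degrees (always_xy=True)
print(f"E = {east:.1f} m, N = {north:.1f} m (UTM zone 32N)")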

One of the main sources of confusion in the world of coordinate systems is that many countries or continents define their own ellipsoids, as these locally approximate the Earth figure better than a global ellipsoid. This then also leads to different UTM coordinates. There are also local reference frames, addressing the specific needs of certain regions. As an example, the ETRS89 (European Terrestrial Reference System) is a European reference frame. It was derived from the ITRF in the year 1989 and has been kept nearly fixed since then because there is essentially no continental drift within Europe. It also uses the GRS80 as the reference ellipsoid and, together with the UTM projection, it defines the official coordinate system of the cadastre in Germany and some other European countries. Note that, due to the regular updates of the ITRF (and the WGS84) since 1989, which take continental drift into account, there is an offset of about 75 cm between the official GPS system and the official European system, which largely does not.

2.1.1.5 Heights

In all the descriptions above, the ellipsoid is the reference surface for the height. This height definition is called ‘ellipsoidal height’ and it is usually the output of GNSS receivers. Another definition of height is the ‘orthometric height’, describing the height above an equipotential surface of the Earth’s gravity field, which is called the Geoid. The shape of the Geoid is rather irregular compared to the ellipsoid, as density variations within the Earth and other geophysical effects lead to irregularities. The ellipsoid serves as a simplified model of the Geoid. However, while no water can flow between two points with the same orthometric height, this is possible for points with the same ellipsoidal height. This fact may also illustrate the importance of orthometric height values for many geodetic or geographic applications. There are models for the difference between the orthometric and ellipsoidal heights, which is called the geoid undulation. The geoid undulation depends on the position on the Earth and it can vary between -100 m and +100 m. In small areas (~km) the value is nearly constant. It is very important to understand which height value is provided by any source of coordinates. National coordinate systems mostly use orthometric heights, while GNSS receivers usually provide ellipsoidal heights. Sometimes GNSS receivers also provide orthometric heights using a simple model for the geoid undulation.

2.1.1.6 Examples

To demonstrate the issues of different coordinate systems, we pick the tip of the Eiffel tower in Paris and show its coordinates in different versions.

Given that continents (e.g. Europe and America) have moved away from each other by roughly 25 cm between the two realizations of the coordinate system, the values are still the same to within centimetres, which nicely shows the benefit of regular updates.

It can be seen here that there is a significant difference between the official European reference system and the global system, which is, for example, provided via GNSS measurements. Similar effects can be expected for other local systems. Also shown here is the difference between the ellipsoidal height and the orthometric height. This difference (44.55 m at the Eiffel tower) can be treated as constant in local areas, but changes over larger distances (e.g. 46.90 m at the Cologne Cathedral, ~400 km away).
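For reference, the relation behind these height values can be written compactly, using the geoid undulation N quoted above for the Eiffel tower:

orthometric height = ellipsoidal height − geoid undulation, i.e. H = h − N (here N = 44.55 m).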

Of course, the geographic coordinates also differ between the two systems. It is helpful to remember that the 5th to 7th decimal places of a coordinate given in degrees correspond to variations between 1 cm and 1 m, depending on the location on the Earth and the actual direction (longitude or latitude).

2.1.2 Sensors for georeferencing

In this section we describe sensors which are often used during the georeferencing process. We start with geodetic measurement equipment, which is usually used on the ground to determine the coordinates of specific points or features. We proceed with an overview of GNSS, which can be used on the ground and also on the UAV. Finally, we describe other sensors, such as inertial sensors, to determine the full position and orientation parameters of the UAV.

2.1.2.1 Terrestrial geodetic measurement equipment

We briefly introduce some of the geodetic measurement equipment that can be used for the determination of object coordinates on the ground. A detailed introduction to surveying techniques and instruments can be found, for example, in Breach & Schofield (2007).

GNSS Receivers have antennas, which are mounted on tripods or poles, and they receive signals from the Global Navigation Satellite Systems in order to determine a point’s position in a global coordinate frame (Figure 2.1-2, left). Because UAVs also contain GNSS receivers, their functionality and measurement principle are described in more detail in the next section.

Total Stations are instruments which can determine the direction and the distance of points in the direct line of sight of the instrument using electro-optical distance measurements. These points are often reflector prisms mounted on a vertical pole, which can then be used to measure points on the ground. However, the reflecting object can also be an arbitrary surface patch. The range can be up to hundreds of metres with an accuracy in the order of millimetres, depending on the type of the instrument, on the reflecting surface and on environmental parameters. Following the notation from the section ‘coordinate systems’, the total station measures a 3D coordinate in its own sensor coordinate system. If an absolute (global) coordinate is needed, the position and orientation of the total station need to be known in the global system. Therefore, a total station cannot directly deliver global coordinates, but in combination with multiple distance and direction measurements between multiple points (a geodetic network), where some points have known absolute coordinates, georeferencing of the full network is possible. The latter procedure is called network adjustment.


Figure 2.1-2: Left: Terrestrial geodetic measurement equipment on tripods (from left to right: GNSS antenna, terrestrial laser scanner, total station). Right: Ground control point (GCP) for indirect georeferencing. Image credits: Christoph Holst.

Terrestrial Laser Scanners (TLS) also measure directions and distances to points on surfaces and therefore provide coordinates in the local scanner coordinate system (Figure 2.1-2, middle). Compared to total stations they are usually a bit less accurate and they also do not aim to measure distinct points, but rather sample the full surroundings with a high data rate, leading to a dense grid of measurements. Laser scanners usually provide the data of interest, similar to airborne LIDAR in chapter 2.6. In order to georeference the data, known points, visible in the scans, are necessary and a network adjustment as in the case of total station measurements needs to be performed.

2.1.2.2 Global Navigation Satellite Systems (GNSS)

One of the main methods for generating global coordinates is the use of GNSS receivers. This is the case for indirect georeferencing methods, where points in the object space are measured using GNSS devices, as well as for direct methods, where the position of the sensing system (e.g. the UAV) is determined directly. This section gives a brief overview of the basic principles of GNSS by explaining how it is possible to achieve centimetre level accuracy and what effects lead to measurement errors. Further information on GNSS and their functionality can be found, for example, in Hofmann-Wellenhof et al. (2008).

While the Global Positioning System (GPS), which is owned and operated by the United States, serves as the system for the explanation in this chapter, other GNSS such as the Russian GLONASS, the European GALILEO or the Chinese BEIDOU follow the same measurement principles.

Basic Concept. The basic concept of GNSS is that a number of satellites in space with known positions synchronously transmit radio signals, which are received by some device on Earth. From the time of flight of the signal, and knowing that it travels with the speed of light, the distances between the receiver and the satellites are determined. Then, multiple distances to multiple known satellite positions allow the calculation of the receiver position using a trilateration method.

Satellites. The so-called ‘Space Segment’ of GPS consists of about 30 satellites orbiting the Earth twice per day at an altitude of 20,000 km (corresponding to a signal time of flight of about 70 ms). Each of the satellites carries an atomic clock to create a common time base and to allow for the synchronous transmission of signals. The whole system is monitored and controlled by a collection of ground stations, which is called the ‘Ground Segment’. The ground segment also determines the exact position of the satellites and potential clock offsets between individual satellites. Both are important for the operability of the system.

Correlation. The measurement principle to determine the time of flight of the signal between the satellites and the receivers (the collection of all receivers is called the ‘User Segment’) is based on correlation. Based on the knowledge about the signal structure of all satellites, the receiver internally creates a replica of the satellite signal at every time of transmission and then measures the time shift between the received and the created signal using cross-correlation. Assuming that the receiver clock is synchronized with the satellite clock, this time shift corresponds to the time of flight of the signal. In reality, the user and the satellite clocks are not synchronized, leading to a ‘receiver’ or ‘user’ clock offset, which needs to be known. As this offset changes quickly and unpredictably (receivers do not have an atomic clock), it is actually an unknown parameter, which has to be estimated every time the position of the receiver is determined.

Code Observations. In order to have a sharp correlation peak in the above-mentioned correlation process, the signal needs to have a small autocorrelation coefficient, as is the case for random signals. Furthermore, the transmitted signal needs to be unique for each satellite in order to distinguish different satellites. To achieve both, the signal is realized as a so-called pseudo-random-noise (PRN) code. This code is a noise-like but deterministic digital sequence, which is unique for each satellite. The sequence consists of 1023 chips (zeros and ones), and it repeats every millisecond. The chips (each of them having a length of about 300 m in vacuum) are modulated on a carrier wave with a wavelength of about 20 cm. In the code-based measurement, this code is reconstructed from the signal and then used for the actual correlation process. Its ‘wavelength’ of 300 m determines the possible correlation accuracy of 1–10 m. The result of a code observation is called ‘pseudo-range’ because it still contains the unknown receiver clock offset.
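The correlation idea can be illustrated with a toy Python example (the code length matches the C/A code, but the chip sequence and delay are random stand-ins, not a real PRN code):

import numpy as np

rng = np.random.default_rng(42)
code = rng.choice([-1.0, 1.0], size=1023)     # stand-in for one code period of +/-1 chips
true_delay = 357                              # delay in chips (unknown to the receiver)
received = np.roll(code, true_delay) + 0.5 * rng.standard_normal(1023)  # delayed + noise

# Circular cross-correlation of the local replica with the received signal
corr = np.array([np.dot(np.roll(code, k), received) for k in range(1023)])
print("estimated delay:", int(np.argmax(corr)), "chips")  # peaks at the true delay (357)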


Most GPS satellites transmit two different codes (C/A and P/Y) on two different carrier frequencies (L1: 1575.42 MHz & L2: 1227.60 MHz). The C/A code can be used by all GNSS receivers, while the P/Y code is encrypted and can be used only by military users. Its shorter chip length (about 30 m) leads to a higher correlation accuracy (0.1–1 m). While the above signals are specific to GPS, other GNSS have similar signals and similar carrier frequencies.

Navigation Message. In addition to the capability of measuring the time of flight, the receiver also needs information about the position of the satellites. This data is also modulated on the signal with the PRN code, but with a much lower bit rate. The so-called navigation message contains information about the satellite’s own position (accuracy ~1 m) and other status parameters, such as satellite clock offsets. The navigation message also contains the positions of all other satellites (the so-called ‘Almanac’), which are needed before a receiver is able to provide a position (‘cold start’).

Navigation solution. If a receiver determines its own absolute position based on code measurements (pseudo-ranges) and the data from the navigation message, then this is called ‘Single Point Positioning’ and the result is called the ‘navigation solution’. For each measurement epoch (1–10 times per second), the receiver uses at least four satellites (leading to at least four observations) to estimate its position and the current receiver clock offset (four unknowns). The absolute accuracy is in the order of 3–15 m depending on the measurement conditions. This measurement mode is the standard mode for consumer grade GPS receivers as used in mobile phones, navigation devices and most autopilots for UAVs. Note that no other information than what is transmitted by the satellites is needed to calculate the navigation solution.
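A minimal sketch of this estimation step is shown below: receiver position and clock offset are solved from pseudo-ranges by iterative least squares. The satellite positions, receiver position and clock offset are synthetic example values, not real observations:

import numpy as np

C = 299792458.0  # speed of light [m/s]

def solve_position(sat_pos, pseudoranges, iterations=10):
    """Estimate receiver ECEF position [m] and clock offset [s] from pseudo-ranges."""
    x = np.zeros(4)  # unknowns: x, y, z and receiver clock offset
    for _ in range(iterations):
        rho = np.linalg.norm(sat_pos - x[:3], axis=1)      # geometric ranges
        predicted = rho + C * x[3]                          # modelled pseudo-ranges
        H = np.hstack([(x[:3] - sat_pos) / rho[:, None],    # design matrix (unit vectors, c)
                       np.full((len(rho), 1), C)])
        dx, *_ = np.linalg.lstsq(H, pseudoranges - predicted, rcond=None)
        x += dx
    return x[:3], x[3]

# Synthetic test: four satellites, a 'true' receiver position and a 1 ms clock offset
sats = np.array([[15600e3,  7540e3, 20140e3],
                 [18760e3,  2750e3, 18610e3],
                 [17610e3, 14630e3, 13480e3],
                 [19170e3,   610e3, 18390e3]])
true_pos, true_dt = np.array([1112e3, 4556e3, 4345e3]), 1e-3
pr = np.linalg.norm(sats - true_pos, axis=1) + C * true_dt
print(solve_position(sats, pr))  # recovers the true position and clock offset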

Geodetic grade positioning. When cm accuracy is needed, as is usually the case for direct and indirect georeferencing of UAV data, the navigation solution is not sufficient. In this case other measurement techniques, such as RTK (Real-Time Kinematic), PPK (Post-Processing Kinematic) or PPP (Precise Point Positioning), are applied. These techniques use carrier phase observations and differential processing (except in the case of PPP) to achieve higher accuracies.

Carrier Phase Observations. During the correlation process in the receiver, after the PRN code has been removed from the carrier wave, the actual correlation procedure can additionally be performed directly on the carrier wave, determining the phase shift between the transmitted and the simulated signal from the satellite and receiver, respectively. This leads to a very high correlation accuracy in the order of millimetres. These measurements are called carrier phase observations, and the higher accuracy comes with some drawbacks. The main problem is that, due to the short wavelength of 20 cm, the determined time shift is highly ambiguous. We only observe a fraction of a full wavelength, while the number of full cycles between the satellite and the receiver remains unknown. These so-called integer ambiguities have to be resolved, which places additional requirements on the receiver, the measurement process, and the processing algorithm.

Observation errors. Both code and carrier phase observations are prone to measurement errors. These result from various effects, which can be classified as satellite, transmission, and receiver environment effects.

The major satellite effect is the uncertain position of the satellite at signal transmission time. The position provided via the navigation message is only accurate to 1–2 m, limiting the potential accuracy of the final result.

Transmission effects are due to the refraction of the signal in the Earth’s ionosphere and troposphere. The ionospheric refraction mainly depends on solar activity and leads to distance errors of up to 100 m, which can be reduced to about 20 m when the refraction is modelled based on knowledge of the current parameters provided with the navigation message. The tropospheric refraction depends on humidity, pressure, and temperature (weather) and introduces errors below a metre. Both errors become larger the lower the satellite is above the horizon, due to the longer path through the atmosphere.

The major receiver environment errors are due to so-called multipath and non-line-of-sight effects. In the case of multipath, the signal from a satellite interferes at the receiver with a version of itself, which has been delayed due to reflection at a surface, such as the ground, a water surface, or a building wall. In the non-line-of-sight case, the direct path between the satellite and the receiver is blocked, but the signal still arrives with a delay at the antenna due to reflection or refraction at a building wall or corner. Multipath errors can reach values of several tens of metres for pseudo-ranges and they are periodic with typical periods of around 10 to 30 minutes, depending on the distance between the antenna and the reflecting surface (thus, averaging over a sufficient observation time, e.g. one hour or longer, can reduce this effect if the antenna is static). Non-line-of-sight errors can be even larger and they are more difficult to detect at the antenna.

Relative Positioning. The key to cm accuracy is the usage of the more accurate carrier phase observations, but only if the ambiguities inherent in these measurements are solved. In order to do so, it is necessary to reduce all observation errors to a minimum. This can be achieved by relative positioning, which involves the usage of two receivers with a maximum distance of about 10 km. The so-called master or base station is usually placed at a known and fixed position and the so-called rover is placed at the position of interest. By forming differences between the observations of one satellite at the two receivers (single differences) and by forming differences between the single differences of two satellites (double differences), all satellite and transmission related errors can be removed or reduced. The differencing procedure, however, leads to the loss of absolute information in the subsequent position determination process, which then provides only the so-called baseline vector between the master and the rover. However, this is possible with an accuracy of a few centimetres, if the already mentioned integer ambiguities could be fixed successfully. Please note that, in order to derive a precise rover position in real time, the rover needs the observations from the base station also in real time, leading to the need for some sort of communication between the two. The whole procedure is then called RTK (Real-Time Kinematic) GNSS. If there is no communication between the two receivers, but all observations from the rover and the base are processed later in a post-processing phase, the procedure is often called PPK (Post-Processing Kinematic).
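The differencing steps themselves are simple to write down; the sketch below uses made-up carrier phase values (in metres) for two receivers and two satellites purely to show how single and double differences are formed:

# carrier phase observations [m] of base and rover to two satellites 'j' and 'k' (synthetic)
phi_base  = {"j": 21234567.812, "k": 22345678.123}
phi_rover = {"j": 21234570.905, "k": 22345680.441}

def single_difference(sat):
    """Between-receiver difference: removes the satellite clock error of 'sat'."""
    return phi_rover[sat] - phi_base[sat]

def double_difference(sat_j, sat_k):
    """Between-satellite difference of single differences: also removes receiver clock errors."""
    return single_difference(sat_j) - single_difference(sat_k)

print(single_difference("j"), double_difference("j", "k"))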

Reference services. An alternative option to setting up an own base station for relative positioning is the usage of a so-called reference service. Reference service providers maintain a network of base stations over large areas (or even the whole world). A receiver connects to the service via the internet to be provided with observations from a close base station. As the distance between the single base stations is usually too large, the service uses the network as a whole and interpolation procedures to simulate the observations of a base station which is very close to the receiver. This simulated base station is also called Virtual Reference Station (VRS). Reference services are sometimes provided by official state authorities (e.g. in Germany ‘SAPOS’ by the official German surveying and mapping authorities) or by companies, such as Trimble, Leica, or John Deere.

Please note that the global coordinate frame of the resulting coordinates is determined by the reference service. As the relative positioning procedure only estimates the relative vector between the reference position and the receiver, the resulting absolute position provided by the receiver is derived by adding this relative vector to the absolute coordinate of the base station. If this coordinate, for example, is provided by the service provider in the ETRS89 coordinate frame with orthometric heights, then the receiver results are also valid in this system. Please be aware of this, as the receiver might not know about the reference system of the reference station and might state that the results are in WGS84 with ellipsoidal heights, as it has no better information.

PPP. In areas where no base station observations are available, it is still possible to calculate precise position information using carrier phase observations. This procedure is called PPP (Precise Point Positioning) and it does not rely on building differences to reduce observation errors. The main idea is to model all observation errors, which usually leads to the need for a large number of model parameters that can quickly change over time and space. These parameters need to be provided by a PPP service provider. Additionally, the process of ambiguity fixing takes more time. As a result, cm accuracy can be achieved, but sometimes only after an observation time of about 10–20 minutes or even longer. In recent years, there has been huge progress in real-time PPP, so it can be assumed that in the future PPP services will offer a similar performance to RTK services.


Figure 2.1-3: Basic principle of relative GNSS measurements. In order to achieve cm level accuracy, observations from a second receiver (master) or a reference network are necessary. Without these additional observations, only a navigation solution with a meter level accuracy is possible.

GNSS Receivers. Mobile phones, navigation devices and nearly all autopilot units in UAVs use code based absolute positioning. The receivers are small, lightweight, and cheap. So far, geodetic grade receivers, which are able to process carrier phase observations on multiple frequencies and which provide communication interfaces to other receivers or to reference services, have been in a cost range of about 10,000 € and were only used by professional surveyors. However, most recently, geodetic grade receivers have become available for less than 1,000 €. This has been taken up by the UAV industry and nowadays RTK/PPK receivers are commercially available on small UAVs for direct georeferencing. Remember that in any case where cm accuracy is needed, (a) the receiver must be able to receive carrier phase observations, and (b) additional GNSS observations from a second GNSS receiver at a known position or a reference service (RTK or PPK) or (c) very specific information about current and local observation errors (PPP) are needed.


While GNSS receivers provide information to locate objects in a global reference frame, the rotation between the body frame and the navigation frame of the UAV needs to be derived from other sensor modalities. Therefore, UAVs contain inertial sensors (gyroscopes and accelerometers) and some of them also use magnetometers or dual antenna GNSS receivers to estimate rotations.

Gyroscopes. Gyroscopes or angular rate sensors measure the angular rate ω_is of the sensor around its sensitive axis with respect to the inertial frame. The inertial frame is the coordinate system which is assumed to be fixed in the universe. As a consequence, a gyroscope lying motionless on the ground would still measure the Earth rotation rate ω_ie. However, most gyroscopes, especially the ones used on UAVs, are not sensitive enough to measure the Earth rotation. Starting from a known orientation and assuming that the gyroscope sensor coordinate frame is identical to the body frame, the data from a three-axis angular rate sensor can be integrated over time to derive the orientation R_nb of the UAV with respect to the navigation frame. Due to the fact that measurement errors are integrated as well, the error of the resulting angles grows over time and needs to be corrected using the sensor fusion methods described below. Details about working principles and properties of gyroscopes can be found for example in (Titterton & Weston, 2004).

Accelerometers. Accelerometers measure the specific force f_is acting on the sensor along its sensitive axis with respect to the inertial frame. The specific force is the non-gravitational force per unit mass, which is basically an acceleration. A three-axis accelerometer attached to a UAV free falling from the sky would measure 0 m/s² in any direction. An accelerometer standing motionless on the ground measures 9.81 m/s² antiparallel to gravity, as this is the specific force of the ground acting on the UAV to prevent it from falling further down along the gravitational field of the Earth. The readings of a three-axis accelerometer can be used in two ways. In the case of a non-accelerating platform, it is possible to calculate two angles between the platform's z-axis and the gravity vector g^n (rotations around the gravity vector are not observable). In the case of an accelerating platform, the gravitational and the translational acceleration components can be separated if the rotation of the platform is known, and the translational component can be integrated twice to derive the position of the platform. For further reading on the working principles and properties of accelerometers, please refer to (Titterton & Weston, 2004).

Magnetometers. A three-axis magnetometer, measuring the vector m^s of the Earth's magnetic field m^n in the local sensor frame, serves as a compass and provides information about the rotation of the platform around the gravity axis, which cannot be observed by accelerometers. This is the reason why magnetometers are integrated into most inertial measurement units (IMUs). A combination of gyroscope, accelerometer, and magnetometer theoretically allows for the derivation of all three angles of rotation between the body frame and the navigation frame. However, the Earth's magnetic field in the close vicinity of UAVs is heavily disturbed by metallic components of the platform or superimposed by other, mostly stronger field sources such as the high currents driving the engines. Therefore, the usage of magnetic field sensors on UAVs is usually avoided.

Dual Antenna GNSS Receivers. By attaching a GNSS master and a rover antenna to the UAV (Figure 2.1-4) with a distance of about 20–100 cm between them, it is possible to determine the baseline vector b_GNSS between these antennas, given in the global coordinate system and with cm accuracy using RTK processing. From this vector it is possible to derive two angles of the rotation of the UAV with respect to the global coordinate system. The third angle, which is the rotation around the baseline vector itself, remains unobserved. The angle accuracy depends on the baseline length. There are GNSS receivers on the market which are equipped with two antenna inputs to derive these orientation angles automatically along with the position information. However, because a proper carrier phase processing chain is needed, they are so far not available in the low-cost segment. Please note that for cm position accuracy the UAV also needs to contain a GNSS receiver which serves as a rover, forming a longer baseline with a master station somewhere on the ground or a virtual reference station. This rover of the long baseline for positioning is usually the master for the short baseline for orientation determination.

Figure 2.1-4: Inertial sensors and a dual GNSS receiver can measure the rotational state of the UAV.


For direct georeferencing and integrated approaches, the full 6D pose information of the aircraft, and therefore of the mapping sensor, needs to be known at each time a measurement (e.g. an image or laser scan) is taken. In most cases this is realized by integrating readings from GNSS receivers and inertial sensors (gyroscopes and accelerometers), as described above, using recursive sensor fusion algorithms such as the Kalman Filter.

Strapdown Integration. Starting from known rotation angles, the orientation of a vehicle can be updated using the relative information from consecutive gyroscope readings. Using this orientation, the readings from an accelerometer can be corrected for the gravitational component, which is not measured separately, resulting in translational acceleration values which can be integrated twice to absolute position information, assuming known starting values for velocity and position. This procedure to derive the trajectory of a vehicle is known as Strapdown Integration. The drawback of the method is that starting values are needed and that sensor errors lead to growing orientation and position errors over time. This is especially the case for the low-cost sensors that are usually implemented on UAVs. To reduce these drift effects, strapdown integration is often combined with GNSS readings using sensor fusion and filtering algorithms, such as the Kalman Filter.
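As a rough illustration of these integration steps, the following Python sketch propagates orientation, velocity and position from body-frame gyroscope and accelerometer readings. The 100 Hz rate, the hovering example and the frame conventions are assumptions made for this sketch only; a real implementation would additionally handle sensor biases and Earth rotation.

```python
import numpy as np
from scipy.spatial.transform import Rotation

dt = 0.01                                   # assumed 100 Hz IMU rate
g_n = np.array([0.0, 0.0, -9.81])           # gravity in the navigation frame (z up)

R_nb = Rotation.identity()                  # known initial orientation (body -> navigation)
v_n = np.zeros(3)                           # known initial velocity
p_n = np.zeros(3)                           # known initial position

def strapdown_step(R_nb, v_n, p_n, gyro_b, accel_b):
    """One integration step with body-frame angular rates [rad/s] and specific force [m/s^2]."""
    R_nb = R_nb * Rotation.from_rotvec(gyro_b * dt)   # attitude update from the gyroscope
    a_n = R_nb.apply(accel_b) + g_n                   # remove gravity: specific force -> acceleration
    v_n = v_n + a_n * dt                              # velocity update
    p_n = p_n + v_n * dt                              # position update
    return R_nb, v_n, p_n

# example: hovering UAV, accelerometer senses +9.81 m/s^2 along the body z-axis
for _ in range(100):                        # one second of synthetic data
    R_nb, v_n, p_n = strapdown_step(R_nb, v_n, p_n,
                                    gyro_b=np.zeros(3),
                                    accel_b=np.array([0.0, 0.0, 9.81]))
print(p_n)   # stays near zero; real sensor errors would make this estimate drift
```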

Kalman Filter. A Kalman Filter estimates the current state (and its uncertainty) of a system by combining knowledge about the system's evolution (and its uncertainty) with all sensor observations (and their uncertainties) about the system's state up to the current time in a statistically optimal way. In the case of UAV trajectory estimation, the state consists of the position and the orientation of the vehicle, and the observations are given by the measurements from the GNSS receiver, the inertial sensors, as well as from potential further sensors such as magnetometers. The current pose estimate is recursively updated every time a new sensor reading is available, leading to a full trajectory estimation. There are many variants of Kalman Filter implementations, which depend mainly on the available sensors and their quality (measurement models) and on knowledge about the motion of the vehicle (system model). Kalman Filters are implemented in most autopilot systems to provide position and orientation in real-time for navigation and flight control. For these purposes, an accuracy in the order of meters and a few degrees is usually sufficient. Aerial LIDAR based sensor units also use Kalman Filters for fusing GNSS and inertial sensors, but due to the usually higher quality of the sensors and the missing requirement of a real-time estimation, the filter can be realized in a different way, leading to more accurate results in the order of centimetres and sub-degrees. This is necessary for direct georeferencing of LIDAR data as described below and in chapter 2.6. A detailed introduction into trajectory estimation using GNSS, inertial sensors, strapdown integration and Kalman Filtering can be found in (Groves, 2013).
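The following minimal sketch shows the predict/update cycle of a Kalman Filter for a strongly simplified case: a one-dimensional constant-velocity model fused with noisy GNSS position fixes. All noise levels and rates are assumed for illustration; a real UAV filter works on the full 3D pose and fuses IMU data as well.

```python
import numpy as np

dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])       # system model (constant velocity)
H = np.array([[1.0, 0.0]])                  # only the position is observed
Q = np.diag([0.01, 0.1])                    # assumed process noise (model uncertainty)
R = np.array([[4.0]])                       # assumed GNSS position variance (2 m std dev)

x = np.array([0.0, 0.0])                    # state: [position, velocity]
P = np.diag([100.0, 10.0])                  # initial state uncertainty

rng = np.random.default_rng(0)
true_pos = 0.0
for k in range(10):
    true_pos += 5.0 * dt                    # vehicle moving at 5 m/s
    z = true_pos + rng.normal(0.0, 2.0)     # noisy GNSS measurement

    # prediction: propagate state and uncertainty with the system model
    x = F @ x
    P = F @ P @ F.T + Q

    # update: blend prediction and measurement according to their uncertainties
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(2) - K @ H) @ P

print(f"estimated position {x[0]:.1f} m, velocity {x[1]:.1f} m/s")
```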


2.1.3 Georeferencing concepts

2.1.3.1 Direct georeferencing

The most prominent example of the usage of direct georeferencing is airborne laser scanning. The position and orientation of the aircraft are determined using the sensors and algorithms described above. This information is used to transform the range and angle measurements of the scanner into the global coordinate system. Another example of direct georeferencing is the use of the aircraft's position and orientation parameters within the structure from motion processing chain, in order to derive a georeferenced point cloud without the use of any ground control points.

We use the geometric model of mobile laser scanning here to demonstrate the concept of direct georeferencing (see also chapter 2.6). If a laser scanner measures an object point x^s in its sensor coordinate system, then the relationship between this measurement and the same object point x^e in the global coordinate system is given by the georeferencing equation:

x^e(t) = p^e(t) + R_en(t) R_nb(t) (l^b + R_bs x^s)

p^e(t), R_en(t) and R_nb(t) are results from the trajectory estimation, describing the position and rotation parameters of the platform with respect to the global frame. Even when these parameters are provided with accuracies in the order of centimetres and millidegrees, it is still a challenging task to link these data spatially and temporally correctly to the actual mapping sensor data. One reason for this is that the transformation between the georeferencing sensor unit coordinate system and the mapping sensor coordinate system has to be known. The translational component of this transformation is called lever arm l^b and the rotational components are called boresight angles (building the rotation matrix R_bs in the above equation). The process of estimating these parameters is called system calibration. It aims to reduce the systematic errors which otherwise would be introduced into the resulting map during the direct georeferencing process. There are many methods to derive lever arm and boresight angles within dedicated calibration flights or procedures, assuming that the parameters do not change over time. There are also methods which use observations from the actual mission to perform an in-situ calibration, which is especially of interest if the parameters cannot be considered as temporally constant. Most of the calibration procedures use objects with known coordinates or scales, such as control points, planes or features in the environment, to detect and correct misalignments induced by wrong calibration parameters.
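The equation can be evaluated directly once all quantities are available. The short Python sketch below georeferences a single laser point; every numeric value (pose, lever arm, boresight angles, scanner measurement) is invented, and the navigation-to-global rotation is simplified to the identity for brevity.

```python
import numpy as np
from scipy.spatial.transform import Rotation

# point measured by the scanner in its own sensor frame [m] (assumed)
x_s = np.array([12.0, 0.5, -3.0])

# system calibration (assumed): lever arm in the body frame and boresight rotation
l_b  = np.array([0.05, 0.00, -0.10])
R_bs = Rotation.from_euler("xyz", [0.2, -0.1, 0.5], degrees=True).as_matrix()

# trajectory estimation result at time t (assumed)
p_e  = np.array([365000.0, 5620000.0, 150.0])     # platform position in the global frame
R_nb = Rotation.from_euler("zyx", [45.0, 2.0, -1.0], degrees=True).as_matrix()
R_en = np.eye(3)                                  # navigation-to-global rotation, simplified here

# georeferencing equation: x_e(t) = p_e(t) + R_en(t) R_nb(t) (l_b + R_bs x_s)
x_e = p_e + R_en @ R_nb @ (l_b + R_bs @ x_s)
print(x_e)
```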


Figure 2.1-5: Position error (y-axis) corresponding to a synchronization error (x-axis) between the trajectory and the mapping sensors, depending on the velocity of the vehicle.

Sensor synchronisation. Another important aspect in direct georeferencing is the temporal relation between the georeferencing sensor and the mapping sensor. In the georeferencing equation, the laser measurement and the trajectory parameters have to be known at the exact same time t. The velocity of the platform directly determines the influence of synchronisation errors. If, for example, the sensor displacement error should be in the order of a few centimetres and the platform moves with a speed of 50 km/h, then the synchronization error should be below 2 ms (see Figure 2.1-5). This accuracy is not trivial to achieve without time deterministic processing components on the UAV. There are two main concepts of synchronisation between mapping sensors and GNSS/IMU units. One approach, which is often used with high performance laser scanners, is to feed in the so-called PPS (pulse per second) signal, which usually can be extracted from GNSS receivers. It corresponds to the time base of the GNSS/IMU unit and hence the trajectory data. The scanner synchronizes its internal clock with this signal to directly assign correct time stamps to the laser measurements. The second approach, which is mostly used with cameras, is to extract a trigger signal from the mapping sensor in the exact moment when it performs the measurement, and then to feed this signal into the GNSS/IMU unit. The GNSS/IMU unit stores the reception time of the signal with its own clock. A disadvantage of this method is that the association of the mapping measurement and the recorded time requires some assumptions on the reliability of the mapping sensor. It is further not always easy to extract a signal from the mapping sensor. Consumer cameras can generate this signal via the hot shoe. Low-cost laser scanners sometimes provide a signal which is synchronized with the internal measurement rate.

2.1.3.2 Indirect georeferencing

In the indirect georeferencing concept, the trajectory of the mapping sensor is not directly estimated as in the direct georeferencing case, but features or points with known coordinates in object space are used to georeference the data.

Map transformation. A simple example for this approach is to first generate a (potentially unscaled) map (3D or 2D) in an arbitrary local coordinate system using a sensor system and the needed processing steps. Afterwards, a seven-parameter Helmert transformation (three translational, three rotational and potentially one scale parameter) is used to transform the map to the target coordinate system. The transformation parameters can be estimated by linking known coordinates in the global coordinate system with the local coordinates of the corresponding map points. This approach is often used with terrestrial laser scanners, as they create point clouds with a high internal accuracy. Also, orthophotos which have no or only an insufficient georeference can be transformed to a global coordinate system in this way (e.g. using only four parameters in this case: two translational, one rotational and scale).
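A common way to estimate the seven transformation parameters from corresponding points is the closed-form solution based on a singular value decomposition (often attributed to Horn or Umeyama). The sketch below is one possible implementation; the point coordinates are invented and, in practice, more points and a residual check would be used.

```python
import numpy as np

def estimate_helmert(local, glob):
    """Estimate a 7-parameter similarity (Helmert) transformation
    glob ~ scale * R @ local + t from corresponding 3D points (N x 3 arrays),
    using the closed-form SVD solution. Illustrative sketch only."""
    mu_l, mu_g = local.mean(axis=0), glob.mean(axis=0)
    dl, dg = local - mu_l, glob - mu_g
    U, S, Vt = np.linalg.svd(dg.T @ dl)                        # cross-covariance
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])    # avoid reflections
    R = U @ D @ Vt
    scale = np.trace(np.diag(S) @ D) / np.sum(dl ** 2)
    t = mu_g - scale * R @ mu_l
    return scale, R, t

# assumed example: four points of a local model and their global coordinates
local = np.array([[0, 0, 0], [10, 0, 0], [10, 10, 2], [0, 10, 1]], float)
glob = np.array([[500020.0, 5601010.0, 100.0],
                 [500030.2, 5601010.5, 100.1],
                 [500030.0, 5601020.6, 102.2],
                 [500019.8, 5601020.1, 101.0]])
scale, R, t = estimate_helmert(local, glob)
print(scale, t)
```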

Aerial Triangulation. However, in the case of UAV imagery and SfM based mapping, things become a bit more complicated because the known coordinates of points or features in the real world need to be incorporated into the actual mapping algorithm. As described in more detail in chapter 2.2, ground control points (GCPs) are distributed over the whole mapping area and their position is determined using GNSS or other geodetic measurement equipment. The GCPs are detected and localized in the images and then used in the SfM pipeline. As a result, the reconstructed map and the derived products (e.g. orthophotos or digital surface models) are represented in the coordinate system of the GCPs. Additionally, the trajectory of the vehicle in the form of a sequence of exterior camera orientation parameters (position and rotation, also in the GCP coordinate system) is an output of this procedure. Using GCPs also helps to improve the accuracy of the map by avoiding systematic reconstruction errors, such as drift or bowl effects, which are inherent to the SfM procedure (see also chapters 2.2 and 2.3). Given a sufficient image geometry, the mapping result can be georeferenced with GCPs alone and no knowledge about the vehicle trajectory. However, nowadays most UAVs comprise GNSS receivers, leading to image positions with an accuracy of several meters. Therefore, integrated approaches utilizing these data have become the standard processing approach.


In integrated approaches, observations of the object space, which come from the actual mapping sensor, are combined with on-board navigation sensor data (IMU, GNSS) from the UAV in order to generate a georeferenced map and simultaneously estimate the trajectory of the UAV in the global coordinate system. The combination of known targets on the ground (GCPs) and image positions recorded with on-board GNSS receivers is the most prominent example for this approach (GNSS-aided Aerial Triangulation). Any combination between ‘no’ and ‘many’ GCPs with ‘cm-level’ or ‘m-level’ on-board GNSS accuracy is possible and has been investigated e.g. in (Gerke & Przybilla, 2016) or (Benjamin et al., 2020).

If the on-board GNSS is using only code observations, as is the case in most UAV autopilot systems, the accuracy of the position is in the order of several meters. Therefore, without any GCPs, the absolute accuracy of the mapping result can also not be better. Furthermore, the internal accuracy of the 3D point cloud may suffer from systematic effects, such as the bowl effect (see chapter 2.3). The additional usage of GCPs will increase the absolute and relative accuracy, depending on the accuracy and the distribution of the GCP positions.

The usage of high-accuracy differential carrier-phase based GNSS receivers on the drone offers the potential of cm-accurate results, even without any GCPs. However, the usage of a few GCPs still provides a higher robustness and improves the estimation of the internal camera parameters, which is in most cases part of the reconstruction process. Especially in missions with a constant flight height, reconstruction algorithms have difficulties in separating the camera focal length parameter from the object distance, and therefore a single GCP, which provides a good measure of the object distance, will help.

The position of a very slowly flying drone (e.g. walking speed = 1 m/s) changes by about 10 cm between two GNSS measurements. In the example of a vertical distance of 30 cm between the GNSS antenna and the camera, a tilt angle of the drone of 20° leads to a horizontal shift of the camera position of about 10 cm. These two examples show that the usage of RTK GNSS receivers on the drone in order to derive camera coordinates with an accuracy of 1–3 cm comes with some technical challenges. The synchronization between the GNSS observations and the actual time stamp of the image exposure, as well as the current spatial relation between the GNSS antenna and the camera focal point (lever-arm), have a significant influence on the accuracy.
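Both numbers can be verified with a quick calculation (assuming a 10 Hz GNSS rate, which matches the 10 cm figure above):

```python
import math

speed = 1.0            # m/s, slow flight
gnss_interval = 0.1    # s, i.e. 10 Hz position updates (assumed rate)
print(speed * gnss_interval)            # ~0.10 m moved between two GNSS fixes

lever_arm = 0.30       # m, vertical offset between GNSS antenna and camera
tilt = math.radians(20.0)
print(lever_arm * math.sin(tilt))       # ~0.10 m horizontal camera offset at 20 deg tilt
```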

The accuracy of UAV LIDAR systems can also be improved by integrated approaches. As described in chapter 2.6, overlapping laser observations, e.g. from various flight strips, can be used to improve the trajectory estimation and therefore the overall result.


2.1.3.4 Available systems

The applicability of direct georeferencing and integrated approaches mainly depends on the accuracy of the on-board georeferencing sensors and the level of integration of the mapping sensors. Some researchers presented prototype systems already in 2013, where on-board RTK GNSS receivers were used to track the position of the UAV (e.g. Turner et al., 2013; Eling et al., 2014; Rehak et al., 2014). These systems were custom-designed solutions and required a deep understanding of the technical and algorithmic integration. In recent years more and more commercial systems became available which offer the possibility of tagging the images with RTK GNSS generated coordinates. These systems, however, mostly do not allow the integration of custom sensors or other cameras than the ones provided by the vendor. They are mostly closed systems, where a sensor or a camera cannot easily be changed. The reason is, as mentioned above, that a proper spatial and temporal relationship between the mapping sensor and the navigation sensors needs to be maintained. However, it can be expected that the number of commercially available systems will increase even more in the future and that the modularity of these systems will increase as well, enabling the usage of arbitrary sensors.

References for further reading


2.2 Principles of image-based 3D reconstruction

Francesco Nex, Yolla Al Asmar, Claudia Stöcker and Markus Gerke

2.2.1 Principles of photogrammetry
2.2.1.1 Pinhole camera model
2.2.1.2 Principle of collinearity
2.2.1.3 Single image orientation – spatial resection
2.2.1.4 Stereo-pair images
2.2.1.5 Ground Sampling Distance (GSD)

2.2.2 Photogrammetric workflow
2.2.2.1 Image orientation – classical approaches
2.2.2.2 Image orientation – modern approaches
2.2.2.3 Image matching algorithms

2.2.3 Generation of end-products
2.2.3.1 3D point clouds
2.2.3.2 TIN and 3D mesh
2.2.3.3 DSM and DTM – Digital Surface and Terrain Models
2.2.3.4 Orthophoto generation

This chapter provides a general overview and a simplified description of the basics of photogrammetry for image-based 3D reconstruction. It starts with a brief introduction of the main principles that are essential to the understanding of photogrammetry. The second and main part of the chapter explains the classical workflow, including image orientation, image matching and the generation of 3D point clouds, DSM and orthomosaics. In this context, traditional as well as modern approaches are presented. In-text references are provided for further reading and deeper insights into technical details. For a more complete explanation of the technical and mathematical details of photogrammetry please refer to (Förstner & Wrobel, 2016, Kraus, 2007, Mikhail et al., 2001).


Photogrammetry is the science of using 2D image measurements to extract 3D information about the position and the geometry of an object, as shown in Figure 2.2-1. The goal of photogrammetric multi-image approaches is to revert the transformation process which takes place when a 3D scene is projected into a 2D camera image. The basic input of photogrammetry is given by two or more images acquired from different positions in space and visualising the same part of a static scene.

Figure 2.2-1: From UAV images to 3D information (illustrated with a dataset from James et al., 2020).

The photogrammetric workflow allows processing a set of images to generate 3D geo-spatial information. The following paragraphs provide insights into the main theoretical principles on which photogrammetry is based.

2.2.1.1 Pinhole camera model

The pinhole camera model describes the mathematical relationship between the coordinates of a 3D point in the object space and its projection onto the image plane (Figure 2.2-2). The pinhole camera model assumes that the images capture central perspective projections of the scene. In the central perspective, the light of a scene converges into a single central point inside the sensor lens, called the projection centre. In the perspective projection, the point of the object scene (A), the corresponding point in the image (a) and the projection centre (O) are arranged along the same line. These three points are also called collinear as they stand together on the imaging ray (from Latin: co = together and linea = line).

2.2.1.2 Principle of collinearity

The collinearity principle is the basis of photogrammetry as it establishes a mathematical function between each point in the image and the corresponding point in the object space. The collinearity equations are:

x = x_pp − c · [r11 (U − U0) + r12 (V − V0) + r13 (W − W0)] / [r31 (U − U0) + r32 (V − V0) + r33 (W − W0)]   (1)

y = y_pp − c · [r21 (U − U0) + r22 (V − V0) + r23 (W − W0)] / [r31 (U − U0) + r32 (V − V0) + r33 (W − W0)]

Where:

x and y are the image coordinates in the camera system,

U, V, W are the ground/object coordinates of the point,

c, x_pp and y_pp are the interior orientation parameters of the camera: focal length and principal point coordinates, respectively,

r11, r12, r13, r21, r22, r23, r31, r32, r33 are the elements of the rotation matrix and are computed from the three rotation angles omega, phi, kappa, rotating around W, V and U, respectively,

U0, V0, W0 are the coordinates of the image projection centre within the ground/object coordinate system.
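To make the relationship concrete, the following Python sketch evaluates the collinearity equations for one object point; the camera pose, focal length and rotation convention are assumptions for illustration, and lens distortion is ignored.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def collinearity(obj_pt, proj_centre, omega, phi, kappa, c, x_pp=0.0, y_pp=0.0):
    """Project an object point (U, V, W) into image coordinates (x, y) with the
    collinearity equations (Equation 1). Angles in degrees; c, x_pp, y_pp are the
    interior orientation. One common rotation convention is assumed here."""
    r = Rotation.from_euler("xyz", [omega, phi, kappa], degrees=True).as_matrix()
    d = np.asarray(obj_pt, float) - np.asarray(proj_centre, float)   # (U-U0, V-V0, W-W0)
    denom = r[2, 0] * d[0] + r[2, 1] * d[1] + r[2, 2] * d[2]
    x = x_pp - c * (r[0, 0] * d[0] + r[0, 1] * d[1] + r[0, 2] * d[2]) / denom
    y = y_pp - c * (r[1, 0] * d[0] + r[1, 1] * d[1] + r[1, 2] * d[2]) / denom
    return x, y

# assumed example: nadir image taken 50 m above a ground point, 8 mm focal length
print(collinearity(obj_pt=[10.0, 5.0, 0.0], proj_centre=[0.0, 0.0, 50.0],
                   omega=0.0, phi=0.0, kappa=0.0, c=0.008))
```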


Figure 2.2-2: Collinearity principle. Unless otherwise stated, all images were prepared by the authors for this chapter.

An important finding is that while a 3D point from the scene will be projected to a distinct point in the image (as can be inferred from Equation 1), an observation in the image only defines an infinite viewing ray in object space: this means that a single image cannot be used to determine the position of points in the 3D object space.

2.2.1.3 Single image orientation – spatial resection

The spatial position and orientation of an image can be determined through spatial resection, using a set of points with known coordinates on the ground and their corresponding coordinates in the image as input. As spatial resection is not a linear process, existing methods (such as the Direct Linear Transform, DLT) linearise the collinearity equations and determine the final solution in an iterative process.

2.2.1.4 Stereo-pair images

Two images can define a so-called stereo-pair. Given an object point visible in both images, its 3D coordinates can be determined by exploiting the collinearity principle and intersecting corresponding imaging rays in space (Figure 2.2-3, right). The intersection of two or more imaging rays in space is called forward intersection.

Figure 2.2-3: Single image and stereo pair: why photogrammetry needs at least two images.

2.2.1.5 Ground Sampling Distance (GSD)

The Ground Sampling Distance is the size of an image pixel projected into the object space. The larger the GSD, the lower the spatial resolution of the image. Typical UAV-based projects have a GSD in the range of one to five centimetres.
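For a nadir image over flat terrain, the GSD follows from the pixel size, the focal length and the flying height. A small helper function, with assumed camera values, illustrates the relation:

```python
def ground_sampling_distance(pixel_size_um, focal_length_mm, flight_height_m):
    """GSD = pixel size * flight height / focal length (nadir image, flat terrain).
    Illustrative helper; the values below are assumptions for a typical UAV camera."""
    return (pixel_size_um * 1e-6) * flight_height_m / (focal_length_mm * 1e-3)

# e.g. 2.4 micrometre pixels, 8.8 mm focal length, 80 m above ground
print(f"GSD = {ground_sampling_distance(2.4, 8.8, 80.0) * 100:.1f} cm")
```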

2.2.2 Photogrammetric workflow

The photogrammetric workflow can be divided into three main steps: (i) the image orientation, (ii) the point cloud extraction and surface modelling and (iii) the orthophoto generation. The image orientation determines the positions and the attitude of the images in the 3D object space. The point cloud extraction produces dense point clouds that allow generating detailed 3D models and meshes, i.e. surface models, while the orthophoto generation step produces an orthoimage (or orthophoto) that can be directly used as a base for topographic mapping. In the following paragraphs, a detailed description of each of these steps will be given.

2.2.2.1 Image orientation – classical approaches

The image orientation is the process of establishing the relationship between the camera capturing the image, the image itself and the terrain, thus establishing the relationship between the image coordinate system and the ground/space coordinate system. The image orientation is the prerequisite to extract any geometric information from the images. In the image orientation process, three different types of coordinate systems can be defined:

1. Image coordinate system: This is a left-handed Cartesian 2D coordinate system, defined by pixel addresses as row and column numbers (r, c). The origin of the system is in the upper left of the image, with the positive x-axis to the right and the positive y-axis downwards (Figure 2.2-4a), and the unit is normally expressed in pixels. In analogue cameras, distances between points in the image were measured in millimetres, defining the origin of this reference system in the centre of the image. For this reason, most commercial software still reports both image coordinate systems.

Figure 2.2-4: Image (a), camera (b) and ground (c) coordinate systems.

2. Camera coordinate system: This is a 3D Cartesian, right-handed coordinate system. The origin is the projection centre O, the (x, y) plane is parallel to the image plane, the positive x-axis is parallel to the flight direction, and the z-axis is the optical axis. The z value for any image point in the camera coordinate system is equal to −c (calibrated focal length), as shown in Figure 2.2-4b. The unit is usually expressed in millimetres (or number of pixels).

3. Ground (terrain/object) coordinate system: This is a 3D Cartesian, right-handed coordinate system (U, V, W). It can be the national mapping system of the country or just a local coordinate system (Figure 2.2-4c), while the unit is usually expressed in metres (chapter 2.1).

The image orientation process can be divided into two main steps: (i) estimation of the interior orientation and (ii) estimation of the exterior orientation (Figure 2.2-5). The interior orientation defines the geometry “inside” the camera, while the exterior orientation determines the position (given by its coordinates U0, V0, W0) and attitude of the camera (given by the three angles omega, phi, kappa around the Cartesian axes) in the object space. The exterior orientation process can further be divided into (i) relative orientation and (ii) absolute orientation.

In the first step, the relative position of the images is determined. Neither the scale nor the positioning of the images in the object space is determined: the images are placed in the so-called photogrammetric relative model. In the absolute orientation, the position and attitude of the images are determined in the real object space: the measurements and the positions recovered from an absolutely oriented image block correspond to reality.

Figure 2.2-5: Interior and exterior orientation.

a) Interior orientation

The camera model relates back to the pinhole camera assumption, resembling a perspective projection. To perform geometric computations, e.g. as shown above in the collinearity equations, three main parameters of the idealised pinhole system must be known: The principal distance (or focal length in case the focus is at infinity) describes the orthogonal distance from the projection centre to the image plane (i.e. along the optical axis). The point where this virtual axis intersects with the focal plane is called the principal point – since this point is a 2D entity, two parameters are needed to describe it. Only when those three parameters are known for the camera can the geometry of the so-called bundle of rays, passing through the projection centre and imaged on the focal plane, be described mathematically. Those parameters need to be computed for each camera/lens individually, e.g. in a lab, or during bundle adjustment (self-calibration; see below).

The pinhole model, however, is an idealised model. In practice, straight lines in the scene are not imaged as straight lines by the optical system, the image plane might be deformed, or pixels might not be strictly quadratic. Therefore, an additional set of distortion parameters is defined and computed for each camera individually as well. In sum, the interior orientation is composed of the following parameters:

Principal distance or focal length (c): the distance between the projection centre (O) and the focal plane.

Principal point (pp): the orthogonal projection of the projection centre (O) onto the focal plane (also called image plane).

Lens distortions: these distortions can be radial, affine and decentring. They model the deformations of the image plane compared to a regular array made of square pixels of the same size. They are usually dimensionless, and their values depend on the model adopted for the camera calibration.

To convert between pixel and metric units, the sensor size or pixel size needs to be known for digital cameras. For metric airborne cameras, all these parameters are usually provided by the camera's manufacturer through camera calibration reports. These parameters can be directly adopted for the image orientation as they are supposed to be valid for any flight performed with this camera. Camera calibration parameters are valid for some years, and they are updated from time to time. On the other hand, the cameras installed on a UAV are usually non-metric cameras. This means that the parameters determined in a calibration procedure are not stable over time, as they can change every time the camera is shut down. This circumstance makes the self-calibration procedure necessary, where the camera parameters of the interior orientation are estimated together with the exterior orientation.
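As an illustration of how such distortion parameters act on image coordinates, the following sketch applies a simple two-coefficient radial distortion (a reduced Brown-style model); the coefficients and coordinates are invented, and real calibration models usually include further radial, decentring and affine terms.

```python
def apply_radial_distortion(x, y, k1, k2):
    """Simple radial lens distortion (k1/k2 only) applied to image coordinates
    given relative to the principal point. Illustrative sketch only."""
    r2 = x**2 + y**2
    factor = 1.0 + k1 * r2 + k2 * r2**2
    return x * factor, y * factor

# assumed example: coordinates in mm and small distortion coefficients
x_u, y_u = 3.0, -2.0
print(apply_radial_distortion(x_u, y_u, k1=-2.5e-4, k2=1.0e-6))
```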

b) Relative orientation (of a stereo-pair)

Let’s consider the case of a simple stereo-pair. The unknown parameters of a stereo-pair are twelve in total (i.e. six exterior orientation parameters for each image), but its relative orientation can be defined using only five parameters. This is possible by fixing the position and orientation of one of the images (six parameters) and by constraining the movement of the second image along the line connecting the two projection centres (the base distance). The remaining parameters of the second image in 3D space are then its three rotations and two translations. The relationship between the images is established by determining corresponding object points in the two images. These corresponding points, known as tie-points, must have known image coordinates but still unknown ground/object space coordinates. It is important to understand that the scale within the relatively oriented stereo model is arbitrary and proportional to the mentioned base distance. The interesting fact is thus that the seven remaining parameters (from five for relative orientation to twelve for full exterior orientation) resemble a similarity transform, i.e. a 3D shift, a rotation and a unique scale.

To compute the relative orientation, a minimum of five tie-points is needed in the overlapping area of both images. The distribution of the tie-points should be as homogeneous as possible. As discussed in the following paragraphs, this process is nowadays performed automatically thanks to the use of algorithms capable of extracting hundreds (or even thousands) of tie-points in each stereo-pair.

As depicted in Figure 2.2-6, the two straight lines (L’ and L”) and the base (i.e. the line connecting the two projection centres O’ and O”) lie on a common plane called the epipolar plane. The epipolar plane properties and its use in automated processing will be discussed in more detail in the following sections.

Figure 2.2-6: Epipolar plane and coplanarity of O’-O”-A.


c) Relative orientation (of an image block)

UAV acquisitions usually generate a huge collection of images: several hundred images can easily be acquired in one flight. The images are typically acquired according to a regular pattern and with defined overlaps, as discussed in the image acquisition section (chapter 1.5). The assembly of many overlapping images is usually called an image block. Each image must be “connected” to one or more images of the block through a sufficient number of tie-points. This allows propagating the relative orientation of the images through the image block.

Bundle Block Adjustment in a nutshell

The orientation of the images is further refined using the BBA. This is a unified process that simultaneously estimates the interior and exterior camera parameters as well as the 3D tie-point coordinates in a statistically optimal manner (Förstner & Wrobel, 2016). The BBA is a non-linear process and hence needs the “approximate” solution generated by the previous relative orientation as initial (starting) values. The mathematical backbone is given by the collinearity equations (Equation 1), and the whole process aims at minimising the re-projection error of the tie-points. The optimal solution is reached by iteratively converging to the optimal estimation of the parameters.

The minimum inputs of the BBA are the images, the set of tie-points extracted from the images of the block and their approximate exterior (and interior) parameters (either in a local reference system or in absolute coordinates given by the on-board GNSS and IMU in the object space). In most BBA approaches, the interior parameters can be estimated within the BBA in case the cameras are not strictly metric (self-calibration, see also above). The outputs of the BBA are the six parameters of the exterior orientation for each image of the block, the 3D coordinates of the tie-points (in the relative system) and the interior orientation parameters (if not given as input).
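The following toy example sketches the core of a BBA: packing camera poses and tie-point coordinates into one parameter vector and minimising the re-projection error with a generic least-squares solver. The synthetic geometry, the simplified pinhole projection and the use of scipy are assumptions of this sketch; production BBAs additionally handle self-calibration, GCPs, robust weighting and sparse solvers.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

c = 3000.0   # assumed focal length in pixels

def project(pts, rvec, tvec):
    """Simplified pinhole projection of world points (N,3) into image coordinates (N,2)."""
    pc = pts @ Rotation.from_rotvec(rvec).as_matrix().T + tvec   # world -> camera frame
    return c * pc[:, :2] / pc[:, 2:3]

def residuals(params, n_cams, n_pts, cam_idx, pt_idx, image_obs):
    """Re-projection errors for all observations; params packs poses and points."""
    rvecs = params[:3 * n_cams].reshape(n_cams, 3)
    tvecs = params[3 * n_cams:6 * n_cams].reshape(n_cams, 3)
    pts = params[6 * n_cams:].reshape(n_pts, 3)
    proj = np.vstack([project(pts[j:j + 1], rvecs[i], tvecs[i])
                      for i, j in zip(cam_idx, pt_idx)])
    return (proj - image_obs).ravel()

# tiny synthetic block: two camera poses observing four object points (invented values)
pts_true = np.array([[0, 0, 0], [12, 0, 1], [12, 12, 0], [0, 12, 2]], float)
rvecs_true = np.zeros((2, 3))
tvecs_true = np.array([[-6, -6, 30], [-14, -6, 30]], float)
cam_idx = np.repeat([0, 1], 4)
pt_idx = np.tile(np.arange(4), 2)
image_obs = np.vstack([project(pts_true[j:j + 1], rvecs_true[i], tvecs_true[i])
                       for i, j in zip(cam_idx, pt_idx)])

# start from perturbed tie-point coordinates and let the adjustment recover them
x0 = np.concatenate([rvecs_true.ravel(), tvecs_true.ravel(), (pts_true + 0.3).ravel()])
sol = least_squares(residuals, x0, args=(2, 4, cam_idx, pt_idx, image_obs))
print("RMS re-projection error [px]:", np.sqrt(np.mean(sol.fun ** 2)))
```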

d) Absolute orientation

The absolute orientation is the process of locating the relative model obtained from the images in an absolute (cartographic) reference system, i.e. of defining the datum. In this step, seven parameters are determined to scale (one parameter), shift (three parameters) and rotate (three parameters) the relative model into the ground coordinate system. Figure 2.2-7 illustrates the absolute orientation concept.


Figure 2.2-7: From relative to absolute orientation (a) and use of GNSS information to absolutely position the images (b).

The absolute orientation can be achieved in two different ways: (i) using a set of GCPs or (ii) using the coordinates of the projection centres given by the GNSS installed on board (chapter 2.1). i) The use of GCPs is the traditional way to define the absolute orientation of an image block and can be referred to as indirect sensor orientation. GCPs are points of known coordinates in the ground reference system that are visible in at least two images of the block. The absolute orientation needs to scale, translate (along X, Y and Z) and rotate (around the three axes) the relative orientation, determining a total of seven unknown parameters. This means that at least three GCPs are needed to perform an absolute orientation. Three is the minimum, but more GCPs are required to produce an accurate absolute orientation. There is no general recommendation on the number of GCPs as this largely depends on various aspects such as terrain, flight conditions, flight planning, land cover and camera specifications. It is, however, the case that GCPs are not only required to define the datum of the image block but also to stabilise its inner geometry (Gerke & Przybilla, 2016). GCPs help to minimise so-called block deformations: effects which are caused by remaining uncertainties and which lead to a distortion of the block geometry (chapter 2.3).

It is essential to mention that the seven unknown parameters can be determined independently after the relative orientation (so-called free-network process) or they can be added as unknowns in the BBA (being optimised together with all the other parameters). The latter approach is the one implemented in most commercial solutions. In this case, the GCPs are added to the BBA process by inserting their image and corresponding ground coordinates. With this information, a relation (i.e. transformation) between the relative and absolute reference system can be established. The mentioned effect of GCPs helping to minimise block deformation can only be obtained with this option.


ii) The position given by the navigational unit of the UAV can also be used to determine the absolute orientation. Typically, UAVs have an on-board positioning unit exploiting one or more satellite constellations (GPS, Glonass, Galileo, BeiDou, etc.). In most cases, the camera trigger is synchronised with the on-board GNSS, storing the position where each image has been acquired. An accurate position of the images would allow georeferencing the scene, making the acquisition of further GCPs obsolete. However, most UAVs are equipped with low-cost receivers that can only provide an approximate position of the UAV, with an accuracy of a few metres. In this case, the absolute orientation is not accurate enough to allow for a good georeferencing. A growing number of UAVs is equipped with GNSS receivers using RTK corrections, providing solutions with accuracies of a few centimetres. However, this theoretical accuracy can often be degraded by insufficient synchronisation between camera trigger and satellite receiver, which introduces systematic shifts in the coordinates (Gerke & Przybilla, 2016), or by an incorrect assessment of the relative position between GNSS antenna and camera (i.e. lever-arm and boresight alignment). The size of these errors depends on the UAV and can potentially prevent accurate georeferencing. Thus, it is recommended to use a few GCPs even when accurate RTK or PPK corrections are used: four GCPs in the four corners of the image block are often sufficient to adjust and relocate the position of the block. The use of GCPs is also recommended to improve the camera self-calibration (especially when using long focal lengths). One should also keep in mind that even with RTK or PPK corrections, a good synchronisation and a lever-arm calibration, the absolute positioning will not be better than 2–3 cm in all three coordinate axes. The inner accuracy of the image block, however, might be much better, i.e. in the order of 1 GSD or even smaller. In some projects, a GCP survey of better quality, using total stations and a surveying network, might thus be necessary.

2.2.2.2 Image orientation – modern approaches

The development of automated algorithms in Computer Vision has changed the paradigm of image orientation in recent years. The focus of research in the Computer Vision domain has been on how to derive scene geometry from uncalibrated cameras without any pre-knowledge of the camera parameters. The concepts described in the previous sections are still valid and in use, but many parts of the photogrammetric processing have been modified and improved. The main differences between the traditional and the modern approach are reflected in a higher flexibility and automation of the image orientation process. In the following sections, a brief description of the main aspects of contemporary methods is given.


a) Automated tie-point extraction

The manual extraction of tie-points on large datasets, like the ones acquired by UAVs, requires a huge amount of time that is completely incompatible with production requirements. For this reason, the development of algorithms like SIFT (Lowe, 2004), SURF (Bay et al. 2006), BRISK (Leutenegger et al., 2011), etc. has made a significant contribution to the automation of large image block processing. These algorithms work according to the three following steps: (i) feature extraction, (ii) feature description and (iii) feature matching. The first step identifies the most suitable points (also called key-points) to be used as tie-points in the orientation; the second step defines a descriptor that gives a unique description of the area around each of these key-points; by comparing the descriptors of overlapping images, the points are finally matched in the third (and last) step.

Feature extraction. This step should select points that are particularly suitable for matching in other images. In particular, two types of points can be detected: corners or blobs. Corners represent a well-defined radiometric discontinuity identified by an image gradient in two perpendicular directions (Figure 2.2-8a, b). Blobs are small image regions which share similar image properties such as brightness or colour (Figure 2.2-8c, d) and have at least one radiometric extreme, positive or negative (i.e. bright or dark). The size of the blob is defined by the intensity in the region.

Figure 2.2-8: Example of a corner (a) and corner detection in an image (b); an example of a blob (c) and blob extracted in the same image (d).

The great majority of recent approaches extract blobs instead of corners, as these regions are more invariant to scale, illumination and (partial) geometric transformations of the same region in images acquired from different positions. While corners are extracted by searching for radiometric gradients on the input image, the extraction of blobs is usually performed on so-called image octaves and image scales. As an example, in the SIFT algorithm (Lowe, 2004), image octaves are progressive down-samplings of the original image (Figure 2.2-9a). An image scale is then a sequence of images generated by applying repeated and progressively larger smoothing (Gaussian) filters to the input image (Figure 2.2-9b). The blob is usually detected by considering the difference of adjacent images in the same scale (Figure 2.2-9c) and then searching for local extremes across scales (Difference of Gaussians, DoG) (Figure 2.2-9d).

Figure 2.2-9: Example of octaves (a), image scales (b), Differences of Gaussians (c) and local extremes search across three image scales (d).

Feature description. Once the features have been extracted in the images, a description must be used to allow the identification of the same points in images depicting the same area. In general, a descriptor should contain/summarise information about the immediate neighbourhood of a feature and transform it into a compact vector of numbers. Descriptors are extracted considering either radiometric gradients or intensity comparisons in these regions. Two main types of descriptors exist, float and binary descriptors, according to the way the information is stored in the vector. Float descriptors are a collection of floating-point numbers, while binary descriptors are a string of binary numbers. Depending on the implementation, descriptors can be rotation, illumination, scale and (partially) affine invariant. The size of the area considered to generate the descriptor and the way to build it depend on the considered implementation.

Feature matching. Once a complete set of key-points and their corresponding descriptors have been generated, the corresponding points (i.e. tie-points or homologous points) in overlapping images need to be matched. Here, each descriptor of one image should be compared to all the descriptors of the other image. The similarity is usually computed considering the “distance” (i.e. the difference) between the corresponding elements of the descriptors. As an example, in the SIFT algorithm each descriptor is composed of 128 numbers: each of these numbers is compared with the corresponding number of the other descriptor and the Euclidean distance is computed from these differences.
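In practice these three steps are available in open libraries. The sketch below uses OpenCV (assuming a build that includes SIFT, i.e. opencv-python 4.4 or newer) to extract, describe and match key-points between two overlapping UAV images; the file names are placeholders.

```python
import cv2

# two overlapping UAV images (placeholder file names)
img1 = cv2.imread("img_100.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("img_101.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()                       # (i) extraction + (ii) description
kp1, desc1 = sift.detectAndCompute(img1, None)
kp2, desc2 = sift.detectAndCompute(img2, None)

# (iii) matching: compare descriptors by Euclidean distance and keep only
# matches that pass Lowe's ratio test to reject ambiguous candidates
matcher = cv2.BFMatcher(cv2.NORM_L2)
matches = matcher.knnMatch(desc1, desc2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(f"{len(kp1)} / {len(kp2)} key-points, {len(good)} tie-point candidates")
```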

b) Fundamental matrix

The relative orientation of a stereo-pair is defined by five parameters. In the modern approach, the relative orientation of a stereo-pair can be “summarised” in a 3x3 matrix called the Fundamental Matrix (Faugeras & Maybank, 1990). This matrix is able to map the position of homologous points in the two images, embedding part of the interior orientation (image distortions are not included) in this process.

The fundamental matrix embeds the concept of the epipolar lines. According to this principle, homologous points and their corresponding point in space lie on a common plane. Given two relatively oriented images and their corresponding Fundamental matrix, for each point in the first image we will not be able to determine the position of the corresponding point in the second image, but we will be able to define the line (called epipolar line) onto which that point will be projected in the second image. This concept is explained earlier and depicted in Figure 2.2-6.

The computation of the Fundamental matrix can be performed using the 8-point algorithm as described in detail by (Hartley & Zisserman, 2004). The general idea is that the tie-points generated by feature extraction and matching provide putative correspondences between the two images: these correspondences can then be used to estimate the Fundamental Matrix.

c) RANSAC (Random Sample Consensus)

Points matched using features and descriptors can often be wrong because of the presence of repetitive patterns (i.e. windows on a building façade, etc.), bad image quality, a lack of distinctive features in the image, etc. These wrong matches can negatively affect the estimation of the Fundamental matrix and, therefore, need to be removed. In this regard, several statistical approaches have been implemented to perform this task. Among them, RANSAC (Fischler & Bolles, 1981) is a general approach that allows eliminating outliers reliably and robustly. The starting point of RANSAC is a mathematical model (in this case, the one given by the Fundamental matrix estimation) that defines the “behaviour” that the points should follow. RANSAC works according to an iterative approach: at each iteration, a minimum number of points (i.e. eight points) is used to estimate the Fundamental matrix. Afterwards, the number of inliers is counted. A pair of matched points is considered an inlier if – using the computed Fundamental matrix and the position of the point in the first image – the epipolar line computed in the second image is close to the corresponding homologous point. After many iterations, the solution with the largest number of inliers is selected as the correct one: all outliers are finally removed from the dataset. RANSAC allows removing up to 40 % of outliers from the dataset and, although the algorithm is almost 40 years old, it is still one of the most used solutions.
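Estimating the Fundamental matrix with RANSAC-based outlier rejection is likewise available off the shelf. The sketch below assumes the matched image coordinates from the previous example are stored in two N x 2 arrays; the file names and threshold values are illustrative.

```python
import numpy as np
import cv2

pts1 = np.load("pts1.npy")      # placeholder files with matched image coordinates
pts2 = np.load("pts2.npy")

# RANSAC with a 1 px point-to-epipolar-line threshold and 99 % confidence
F, inlier_mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.99)
print("Fundamental matrix:\n", F)
print("inliers:", int(inlier_mask.sum()), "of", len(pts1), "matches")

# epipolar constraint check: for inliers, x2^T F x1 should be close to zero
x1 = np.hstack([pts1, np.ones((len(pts1), 1))])
x2 = np.hstack([pts2, np.ones((len(pts2), 1))])
residual = np.abs(np.sum((x2 @ F) * x1, axis=1))
print("median algebraic residual (inliers):",
      np.median(residual[inlier_mask.ravel() == 1]))
```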

d) Structure-from-Motion (SfM)

Structure from Motion is the technique to relatively orient a sequence of images together in order to generate an image block. The outputs of SfM are the orientation parameters of the images (i.e. motion) defined in a relative system, the interior camera parameters from camera self-calibration (if needed), and the positions of the matched tie-points (i.e. structure) in the object space that define a 3D model of the scene structure. There are very different implementations of SfM, but the simplest implementation can be divided into four subsequent steps (according to Pollefeys et al., 2004):

Match or track points over the whole image sequence. This process usually performs the already described feature extraction and matching.

Initialise the structure using two images suitable for initialisation. One of the two frames is defined as the origin of the reference system and as the reference for the angular measurements (i.e. its orientation angles are all 0). Using these first two images, the structure is initialised (i.e. the first points are matched and their positions estimated via forward intersection in the local system defined by the image pair).

Add new images to the sequence. For every new image, its position is inferred by applying spatial resection, using the points matched with the already oriented images, and roto-translating it into their reference system. The orientation of the new image is then refined using a Bundle Block Adjustment. In many implementations, this step is just a local BBA, considering only the neighbouring images and not the complete image block. Every new image allows for the forward intersection of further matched points and thus new observations that can be used in the orientation of the next image.

Run the BBA. A Bundle Block Adjustment is finally run to refine both the image positions and the point positions (i.e. the sparse point cloud) of the whole image block.

Like all sequential processes, SfM can run into serious problems of error accumulation: small errors in the relative orientation of a single stereo-pair can accumulate over a sequence of stereo-pairs, leading to large deformations of the image block. Therefore, the Bundle Block Adjustment is a fundamental step in SfM as it allows achieving more precise results, keeping the deformations low. In SfM the image orientation is often defined by the Projection Matrix. This matrix embeds both the interior and exterior parameters, allowing the transition between image coordinates and object space coordinates and vice versa.

SfM was originally conceived for terrestrial image sequences, with images acquired sequentially without knowing their position. If no a priori information is available, then the feature extraction and matching must be performed considering all other images, increasing the computational effort of the image block orientation. Another common way to process such sequences was to concatenate the images by considering the time of acquisition: each image is assumed to overlap with its neighbouring ones. More advanced approaches have recently been implemented (Schönberger & Frahm, 2016) to cope with unordered sequences of images.

Most UAV platforms store GNSS information on the location where each image was acquired. This information can clarify in advance which images overlap and which ones are too far apart to overlap. In this way, the tie-point extraction can be performed on a limited number of images instead of the whole image block. Different strategies can be adopted for this purpose: one example is given in Figure 2.2-10. Starting from the initial stereo-pair, a radius is considered to define the next image of the sequence. Simple criteria, such as the “closest not yet oriented image”, can be used to decide where to move the concatenation process. It must be noted that multiple matches (i.e. tie-points visible in more than two images) are extremely important to make the block more “rigid” and to reduce deformations.

Figure 2.2-10: Concatenation strategy in a UAV acquisition (nadir image case).

2.2.2.3 Image matching algorithms

The term image matching refers to the techniques used to identify and match identical object features in overlapping images in order to reconstruct their position in space. Two main types of image matching exist: feature matching and area-based matching. Feature matching has already been described in the image orientation section: the aim of this process is to detect well recognisable features to be used as tie-points in the orientation process. The result of feature matching is a sparse point cloud generated by intersecting the tie-points in the object space. Area-based matching, on the other hand, aims at maximising the number of matched points in order to generate a dense point cloud. Many area-based matching algorithms have been developed in the last decades. The basic idea of these methods is to associate many pixels in a reference image with the corresponding (homologous) points in the search image. In the ideal case, a corresponding pixel in the second (or slave) image should be found for each pixel of the reference image (assuming no occlusions). In order to define the similarity between corresponding pixels, an image patch centred on the pixel to match is extracted in the first image. This patch is then compared to a sliding window of the same size in the second image (see Figure 2.2-11): the window is translated one pixel at a time in the horizontal and vertical directions. A similarity measure (such as normalised cross-correlation) is computed at all positions to define the degree of similarity between image patches. The location of the correlation maximum is finally selected as the corresponding match.

Figure 2.2-11: The basic idea of area-based matching using a reference and a slave image and one-dimensional explanation of correlation (right).
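The sketch below illustrates this sliding-window principle with normalised cross-correlation, restricting the search to the same image row (a stand-in for the epipolar-line search described below); it uses OpenCV's template matching and assumes both images are grey-scale arrays of the same data type, with the patch lying away from the image borders.

```python
import cv2
import numpy as np

def match_along_row(reference, search, row, col, half=10):
    """Find the column in `search` that best matches a (2*half+1)^2 patch of
    `reference` centred at (row, col), using normalised cross-correlation."""
    patch = reference[row - half:row + half + 1, col - half:col + half + 1]
    strip = search[row - half:row + half + 1, :]        # 1D search along the row
    ncc = cv2.matchTemplate(strip, patch, cv2.TM_CCORR_NORMED)
    best = int(np.argmax(ncc))                          # position of the correlation maximum
    return best + half, float(ncc.max())                # matched column and its score
```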

Many image matching algorithms adopt different similarity measures, such as mutual information or mean squared differences. As can easily be understood, this process can lead to many wrong matches, since many parts of the picture can look similar when analysing a small patch. On the other hand, the same point can appear completely different in two images because of varying illumination conditions or viewing angles. For this reason, many strategies to reduce the number of incorrect matches have been implemented. This long development can be summarised in four key advances.

Use of the epipolar constraint. Given two oriented images, the epipolar constraint assures that the homologous point in the second image lies on the epipolar line, which reduces the search space in the slave image to a 1D problem.

Multi-resolution point cloud generation. The epipolar constraint alone is often not sufficient to prevent wrong matches, as the epipolar line is still very long. The use of multi-resolution images can partially solve this problem by iteratively refining the reconstruction in 3D space, progressively using higher resolution images. The images are initially down-sampled, and the image matching is performed on the lower resolution images: as the resolution is lower, larger areas are covered by each patch (the GSD is larger), decreasing the likelihood of ambiguities and wrong correspondences. The set of matches generated in this phase is then used to build a first rough model (Model 1 in Figure 2.2-12). The second iteration considers higher resolution images and is guided by the approximate model generated in the previous step, reducing the search area in the matching process. The process is repeated until the full resolution images are used as input to generate a full-resolution model. Three to five image resolutions are usually used in this process.


Figure 2.2-12: Multi-resolution approach in a stereo pair (please note the shortened epipolar line) and progressive reconstruction of the model from coarser (using low-resolution images) to more defined (using full-resolution images).
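A minimal coarse-to-fine sketch of this idea is shown below, again using OpenCV template matching as the similarity measure and three hypothetical pyramid levels; it assumes the true match lies away from the image borders so the refined search windows remain valid.

```python
import cv2
import numpy as np

def coarse_to_fine(patch, search_image, levels=3, margin=8):
    """Locate `patch` in `search_image`, matching first at low resolution and
    refining the position within a small window at each finer pyramid level."""
    patches, images = [patch], [search_image]
    for _ in range(levels - 1):                      # build pyramids, coarsest level first
        patches.insert(0, cv2.pyrDown(patches[0]))
        images.insert(0, cv2.pyrDown(images[0]))

    # Full search only at the coarsest resolution.
    res = cv2.matchTemplate(images[0], patches[0], cv2.TM_CCORR_NORMED)
    y, x = np.unravel_index(np.argmax(res), res.shape)

    # At each finer level, search only around the up-scaled previous estimate.
    for lvl in range(1, levels):
        y, x = 2 * y, 2 * x
        img, pat = images[lvl], patches[lvl]
        y0, x0 = max(y - margin, 0), max(x - margin, 0)
        window = img[y0:y0 + pat.shape[0] + 2 * margin,
                     x0:x0 + pat.shape[1] + 2 * margin]
        res = cv2.matchTemplate(window, pat, cv2.TM_CCORR_NORMED)
        dy, dx = np.unravel_index(np.argmax(res), res.shape)
        y, x = y0 + dy, x0 + dx
    return x, y                                      # top-left corner of the match at full resolution
```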

Multi-image matching. In photogrammetric applications, images are usually acquired with high overlaps, and more than two images capture the same point on the ground. In multi-image approaches, this redundant information is used to improve the quality and reliability of the results. For a reference image, cross-correlation values are checked against more than one slave image at the same time, and their information is merged to obtain a unique and more robust result (Figure 2.2-13).

Figure 2.2-13: Epipolar geometry in a multi-image approach.

Semi-global matching. Semi-global approaches (Hirschmüller, 2005) fuse the information provided by the correlation on a local patch with the contextual information provided by neighbouring pixels. The general idea of these approaches is to check the consistency of each putative match against the positions of the adjacent matched pixels: in a 3D reconstruction, each point should typically be close to other points. In Figure 2.2-14, the red and green points are both good candidate matches along the projection line, but only the green one is consistent with the position of the neighbouring points. From a mathematical point of view, this information is stored in a so-called cost function that is minimised. The cost function “forces” the generation of locally flat surfaces, penalising irregular changes (i.e. sudden depth variations) with respect to the position of neighbouring points. At the same time, large depth variations are still allowed through a second parameter in the penalty function; in general, this leads to the preservation of, for example, roof edges.
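OpenCV exposes a semi-global block matching variant of this algorithm; the short sketch below computes a dense disparity map from a rectified grey-scale stereo pair, with the two penalty parameters playing the role described above (the specific values are illustrative).

```python
import cv2

def sgm_disparity(left, right, block_size=5):
    """Dense disparity from a rectified stereo pair using semi-global matching."""
    matcher = cv2.StereoSGBM_create(
        minDisparity=0,
        numDisparities=128,              # search range, must be divisible by 16
        blockSize=block_size,
        P1=8 * block_size ** 2,          # small penalty: discourages minor depth changes
        P2=32 * block_size ** 2,         # large penalty: allows real depth discontinuities
        uniquenessRatio=10,
    )
    # compute() returns a fixed-point disparity map scaled by 16.
    return matcher.compute(left, right).astype('float32') / 16.0
```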

Figure 2.2-14: Consistency of matches in object space: the green point is close to other points in the reconstructed space.

2.2.3 Generation of end-products

2.2.3.1 3D point clouds

Image matching algorithms are able to generate huge point clouds. Point clouds are collections of points matched in the object space (chapter 3.5). Each point is generally defined by three coordinates in a local Euclidean (U, V, W) or cartographic (East, North, height) reference system. As images usually store RGB colours, the matched point can also incorporate this information: in this case, each point is defined by three coordinates and three colour values. Point clouds are the direct output of image matching, but they are often impractical to deliver to final users. In this regard, the point cloud can be converted into other formats.

2.2.3.2 TIN and 3D mesh

A TIN is a digital representation of a continuous surface consisting of non-overlapping/non-intersecting triangular elements. A TIN is obtained by connecting the points of a point cloud, which become the vertices of each triangle. TINs are usually produced from aerial nadir acquisitions: each X and Y value can have only one Z value.

A 3D mesh is a generalisation of a TIN, as it represents the surface of a generic object (e.g. landscape, statue, building) in space: in this case, each X and Y value can have multiple Z values. 3D meshes are usually textured by adding the radiometric information of the images.

2.2.3.3 DSM and DTM – Digital Surface and Terrain Models

DSM and DTM are digital representations of the Earth surface stored in raster format with a regular grid, each pixel containing one elevation/height value (for more details see chapter 3.4). A DSM is a geometric model of the Earth surface and includes the elevation of the topography and of all natural (e.g. trees) and human-made (e.g. buildings, bridges) objects. A DTM contains only the elevation of the bare terrain.

The reduction of a DSM into a DTM requires identifying the objects that protrude above the ground. Many algorithms have been developed in the last decades, though none delivers accurate results under all operational conditions. The general idea of these approaches is that ground points are lower than the others, although several exceptions may occur in hilly areas. DTM extraction using photogrammetric data such as UAV images is more challenging than using LiDAR data because of the higher number of outliers and the unavailability of multi-echo information, which allows penetration into vegetated areas and provides the position of the ground in hidden regions (Gevaert et al., 2018).
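As a toy illustration of the "ground points are lower" idea, the sketch below approximates a DTM from a DSM raster with a grey-scale morphological opening; the window size and tolerance are hypothetical and, as noted above, such a simple filter fails where the terrain itself varies strongly within the window.

```python
import numpy as np
from scipy import ndimage

def dsm_to_dtm(dsm, window=35, tolerance=0.2):
    """Crude DSM-to-DTM reduction: a morphological opening removes objects
    narrower than `window` cells; cells within `tolerance` (same units as the
    DSM) of that surface are kept as ground, the rest take the opened value."""
    opened = ndimage.grey_opening(dsm, size=(window, window))
    return np.where(dsm - opened < tolerance, dsm, opened)
```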

2.2.3.4 Orthophoto generation

The images acquired during a UAV flight cannot be used directly to infer the position and geometry of objects on the ground. As already discussed, the images are central perspective projections and (i) the size of the pixels (i.e. the scale) on the ground varies according to their position in the image frame, and (ii) the appearance of the objects is distorted according to the different viewing angles.

To determine metric information from images, we need to generate an orthophoto. The orthophoto (or ortho-image) is a geometrically corrected (“orthorectified”) image in which the scale is uniform: it combines the characteristics of the image with the geometry of a map, offering the same metric information. Orthorectification is the process of projecting the image content onto the surface of the ground, removing central perspective distortions and transforming the image information into an orthogonal projection. It can be divided into four main steps, which are summarised in Figure 2.2-15. The process is repeated for each pixel of the orthophoto.


1. The orthophoto is initially a blank image. As shown in Figure 2.2-15, the orthophoto has the same planimetric coordinates as the DSM and the same resolution: each pixel in the DSM corresponds to a pixel in the orthophoto. This is not always the case but, in general, these resolutions are similar and, as a rule of thumb, the orthophoto pixel size (GSD) in ground units should not be smaller than the GSD of the original images.

2. Each pixel of the orthophoto is first projected onto the DSM. In this way, the height component of this pixel is determined.

3. The point in the DSM is then back-projected into the image using the collinearity equation. A radiometric value from the original image is then interpolated. Nearest neighbour, bilinear interpolation and cubic convolution are typical interpolation methods used in orthophoto generation.

4. The interpolated value on the original image is finally stored in the orthophoto.

Figure 2.2-15: Main scheme of the orthophoto generation process.
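The four steps can be condensed into the following simplified sketch, which assumes a single image, a pinhole camera model (camera matrix K, rotation R from world to camera, camera centre C) and a north-up DSM whose upper-left cell is at (E0, N0); occlusions are ignored and only nearest-neighbour interpolation is used, so it is purely illustrative.

```python
import numpy as np

def orthorectify(image, dsm, K, R, C, gsd, origin):
    """Minimal single-image orthorectification: for every DSM/orthophoto cell,
    project the 3D ground point into the image and sample its radiometry."""
    E0, N0 = origin
    rows, cols = dsm.shape
    ortho = np.zeros((rows, cols), dtype=image.dtype)    # step 1: blank orthophoto
    for i in range(rows):
        for j in range(cols):
            # step 2: planimetric position plus DSM height gives the ground point
            X = np.array([E0 + j * gsd, N0 - i * gsd, dsm[i, j]])
            # step 3: collinearity / pinhole back-projection into the image
            x_cam = R @ (X - C)
            if x_cam[2] <= 0:                            # point behind the camera
                continue
            u, v, w = K @ x_cam
            col, row = int(round(u / w)), int(round(v / w))
            # step 4: store the (nearest-neighbour) radiometric value
            if 0 <= row < image.shape[0] and 0 <= col < image.shape[1]:
                ortho[i, j] = image[row, col]
    return ortho
```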

The orthorectification process can be performed using either the DTM or the DSM as altimetric support. Traditionally, orthorectification was performed using the DTM: in this case, only the elements on the ground are geometrically corrected and can be used for measurements. All other elements (buildings, trees, etc.) are only projected onto the DTM and remain distorted and displaced from their correct position; they cannot be used for extracting any geometric information. The increased automation and reliability of dense point cloud generation has eased the use of the DSM in the orthorectification process. The product generated using the DSM is called a true-orthophoto: the true-orthophoto rectifies all the elements, producing a geometrically corrected representation of the whole scene (i.e. all the elements can be measured).

Software

References for further reading


2.3 Uncertainty in image-based 3D reconstruction

Mike R. James and Stuart Robson

2.3.1 Random error in photogrammetric image networks ... 136
2.3.1.1 Photogrammetric considerations ... 136
2.3.1.2 Georeferencing ... 137
2.3.1.3 Survey precision estimates ... 137
2.3.2 Systematic error in photogrammetric image networks ... 140
2.3.2.1 Photogrammetric considerations ... 140
2.3.2.2 Georeferencing ... 142
2.3.3 Quantifying uncertainty ... 143
2.3.4 Reducing uncertainty through survey design and processing ... 145
2.3.5 Case study: topographic change detection ... 147
2.3.5.1 Survey design and execution ... 148
2.3.5.2 Accuracy of the image networks ... 148
2.3.5.3 Accuracy of the geospatial products ... 149
2.3.5.4 Repeatability of the geospatial products ... 152
2.3.5.5 Change detection ... 153
2.3.6 Summary ... 154

Uncertainty in photogrammetric geospatial output arises from error in the input measurement data and errors in the geometric model describing the light ray paths from object to image, and their geometric interaction with the camera. Here, we focus on the causes and characteristics of uncertainty associated with topographic point coordinate output, which underpins derived geospatial products (e.g. DEMs and orthomosaics). We aim to enable uncertainty to be successfully reduced, and to be considered appropriately within subsequent analyses. By focussing on geometric aspects, we leave uncertainty in radiometric considerations (which are particularly important for applications such as thermal surveys) to chapters 2.4 and 2.5.

Uncertainty estimates describe the likely magnitude of error on measurements, and quantify the qualitative concept of measurement ‘accuracy’. A measurement error (a measured value minus the associated accepted reference value (JCGM, 2012)) can be considered to comprise a random component, whose likely magnitude is described by the measurement precision, and a systematic component that is represented by ‘trueness’ (Figure 2.3-1). For clarity in terminology, we draw on the rigorous definitions provided by the ISO measurement community (JCGM, 2012; ISO, 1994; JCGM, 2008):

Accuracy: a qualitative term describing the closeness of agreement between an individual measurement and the accepted reference value (ISO, 1994) or the true value (JCGM, 2012). Accuracy encompasses contributions due to both measurement precision and trueness.

Precision: the distribution of repeated measurement values obtained under stipulated conditions. Precision reflects the impact of random components of error and does not relate to the true or accepted reference values. Quantitative estimates of precision can be provided using statistics such as the standard deviation.

Trueness: the closeness of agreement between the average of a large number of measurements and the accepted reference value (the difference between these values is expressed quantitatively as ‘bias’).

Figure 2.3-1: Schematic illustration of the relationships between error and uncertainty terms.

Adapted from Menditto et al. (2007).


For direct measurements (e.g. determining an object’s weight with a mass balance), random and systematic errors are often clearly distinguishable from each other and can be handled appropriately. In contrast, photogrammetric measurements, such as point coordinates, are generated from numerical modelling of indirect observations (i.e. the ‘bundle adjustment’, or optimisation, of an ‘image network’ of image feature coordinates, chapter 2.2). The resulting interdependencies give rise to complex relationships making random and systematic errors challenging to isolate.

Geospatial output is typically derived from the optimum least squares estimates of the parameters of the equations that comprise the photogrammetric model. However, complex parameter inter-relations may result in poor estimates for some parameters, and associated systematic error in others (e.g. Figure 2.3-2). Uncertainty in geospatial results therefore reflects contributions from both errors in the input data and limitations and weaknesses in the photogrammetric model. A clear separation of random and systematic error components is not usually possible. The complexities involved mean that uncertainty is not only likely to vary between survey sites, but also between repeat surveys of the same site, and spatially across individual surveys. Thus, uncertainty assessments should be clear about their limitations, whether they cover repeatability and replicability and, consequently, how generalised and transferable they may be to other surveys.

Figure 2.3-2: Ray diagrams illustrating the modelled positions of topographic points, S, reconstructed from observations in three photographs. (a) The true, i.e. error-free, scenario in which all rays are coincident for each point (rays only shown for one illustrative point). (b) In real images, image observations are associated with random error, ε1–3, which is propagated into the topographic point coordinate estimates. (c) Error in estimated camera parameters, e.g. in the principal distance, εf, results in additional systematic error (bias) in the point coordinate estimates. Unless otherwise stated, all images were prepared by the authors for this chapter.


The photogrammetric model can be considered in two components: a stochastic model and a functional model. During photogrammetric processing, the stochastic model describes the expected distribution of random error on both the observations and the estimated parameters (chapter 2.3.1) and provides precision estimates for all model parameter values. The functional component describes the underlying optical physics (including the collinearity equations, chapter 2.2). However, omissions or weaknesses in the functional model are not generally identifiable within the precision estimates and usually result in spatially correlated, systematic error across a survey (chapter 2.3.2). For both random and systematic contributions, uncertainty can also be considered in terms of photogrammetric components (i.e. related to the image network geometry that defines the underlying shape of the model) and georeferencing aspects (which scale, orient and locate the model in the real-world coordinate system).

2.3.1 Random error in photogrammetric image networks

Random error is introduced into photogrammetric image networks by the finite precision of the observations (i.e. the input data for the bundle adjustment). In SfM-based processing, most observations are provided as the automatically measured tie point image coordinates that form the majority of the image network. However, georeferencing requires additional observations (e.g. GNSS measurements of control points or camera positions, chapter 2.1), which are associated with their own error characteristics that must be accommodated in the bundle adjustment.

2.3.1.1 Photogrammetric considerations

The precision of tie point image coordinates is a function of the algorithms used to identify them, and of the local image content and micro-contrast (texture). For example, the centroiding algorithms used within engineering metrology, which are tuned for locating well-illuminated circular artificial targets, can provide feature coordinates that are good to ~0.02 pixels (Gruen, 2012; Trinder, 1989; Shortis et al., 1995; Dold, 1996). In contrast, robust feature-based algorithms such as SIFT (Lowe, 2004) or SURF (Bay et al., 2008), typically used for locating natural image features in UAV survey data (which are unlikely to be geometrically simple and may vary between image acquisitions in environmental scenes, e.g. vegetation moving in the wind), generally provide image feature coordinates that correlate across images to ~0.1–0.5 pixels (Remondino, 2006; Barazzetti et al., 2010; Ahmadabadian, 2013).


Within the image network, random error on the image observations contributes directly to random error in the 3D tie point coordinates (e.g. Figure 2.3-2b), and un-modelled residual error is represented by the tie point image residuals. Thus, photogrammetric contributions to tie point precision make point coordinate values sensitive to such error. Where tie points have only a few poor observations (i.e. large image residuals) in images acquired from similar directions, their coordinate precision will be weak. Tie point precision is strengthened by acquiring high quality observations (i.e. giving small image residuals) in many images, from diverse directions. Whilst a geometric solution, in which shape is determined on an arbitrary coordinate datum, is possible using tie points alone, photogrammetric and image measurement considerations also apply to the image observations of ground control points (GCPs) used for georeferencing.

2.3.1.2 Georeferencing

The control observations used for survey georeferencing (e.g. GNSS measurements of ground targets or camera positions) define the scale and orientation of the photogrammetric model with respect to the external coordinate system (chapters 2.1 and 2.2). Uncertainty in such observations is thus propagated through this datum definition to the point coordinate estimates by the stochastic model, and presents as spatially correlated (i.e. systematic) uncertainty in georeferenced topographic results. Good georeferencing precision is achieved through control data that tightly constrain this transformation, e.g. a large number of carefully measured control points, widely dispersed across the survey. Considering georeferencing effects alone, the position of optimum point coordinate precision is located at the weighted centroid of the control measurements. Georeferencing contributions to precision steadily weaken away from the centroid, as extrapolated scale and orientation errors increase, and may be compounded by other systematic biases resulting from the imaging configuration used (chapter 2.3.2).

2.3.1.3 Survey precision estimates

Estimates of survey precision given by the stochastic model therefore cover contributions from both photogrammetric and georeferencing aspects. The magnitudes and directions of random error may be spatially systematic, and the coordinate estimates for any one tie point (i.e. its X, Y and Z values) will be inter-dependent. Such inter-dependencies can be described by a coordinate covariance matrix for each point, which enables precision estimates to be visualised as oriented 3D ellipsoids (or projections of them; Figure 2.3-3).


Figure 2.3-3: Illustrative precision ellipses for 3D points observed (dashed lines) in different numbers of variously oriented cameras (triangles). Many, converging observations result in good precision (small ellipses) and few, near parallel observations provide weaker precision (larger, elongated ellipses). Redrawn and adapted from James et al. (2017).
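The link between a 3 × 3 point covariance matrix and its precision ellipsoid is an eigendecomposition, as in the short sketch below; the 1.96 factor is only an illustrative per-axis scaling towards a ~95 % interval, not the exact 3D confidence region (which would use a chi-square quantile).

```python
import numpy as np

def precision_ellipsoid(cov, scale=1.96):
    """Semi-axis lengths and orientations of a tie point's precision ellipsoid
    from its 3x3 coordinate covariance matrix (X, Y, Z)."""
    eigval, eigvec = np.linalg.eigh(cov)       # eigenvalues ascending, eigenvectors as columns
    semi_axes = scale * np.sqrt(eigval)        # ellipsoid semi-axis lengths
    return semi_axes, eigvec                   # columns of eigvec give the axis directions
```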

Software that provides coordinate covariances from network adjustments (i.e. precision ellipsoids, Figure 2.3-3) can be particularly informative, but valuable information can also be gained from spatial patterns in precision magnitudes (James et al., 2017; James et al., 2020). Figure 2.3-4 gives indicative examples of patterns that reflect different relative influences of photogrammetric and georeferencing contributions to survey precision.

Where a georeferenced datum is strong, estimated point coordinate precision may be limited by photogrammetric aspects such as tie points having few or poor observations. An example is the use of many, well distributed control points with externally measured coordinate input precisions similar to the coordination capability of the photogrammetric network (Figure 2.3-4d, left column). Precision variations, both within and between surveys, may thus reflect differences in photogrammetric aspects: areas of reduced image overlap, or areas of steep faces or vegetation where tie points may have few, low quality observations or where observations come from a restricted range of angles (e.g. Figure 2.3-3).

Weaker georeferencing, for example through using fewer GCPs (Figure 2.3-4d, second column from left), more poorly constrained GCP observations (Figure 2.3-4d, third column from left), or at locations increasingly far from the control centroid, leads to point coordinate precision weakening systematically across the surveyed area. Regions of best precision become increasingly focussed around the centroid of the control point distribution (e.g. Figure 2.3-4d, third column from left).

Considering coordinate precision magnitudes can help enable improvements or optimisations in future surveys, and can be used within uncertainty-bound change detection between surveys (chapter 2.3.5). However, precision covariance, including covariance between points, is a function of the datum definition and is not yet generally considered in geospatial products. For the most accurate analyses, both magnitudes and covariances are needed to consider precision-limiting processes fully.


Figure 2.3-4: Variation of Z-coordinate precision for a survey of a low-relief proglacial river, Arolla region, Switzerland. (a) Oblique overview showing all camera positions (blue rectangles). (b) Orthoimage, with the area of interest outlined in red; coordinates in Swiss National Grid. (c) DEM of the area of interest, cropped to remove wetted areas. (d) Z-precision of tie points from processing all images (top row) and a reduced image set from parallel-only flight lines (bottom row) as a function of georeferencing strength. For georeferencing with GCPs, red triangles show the GCPs used in the bundle adjustment; in the right-hand column, the GCP precision has been weakened by a factor of ten compared to the left and central columns. Where the number of control points has been reduced, note that some are retained at the survey boundaries so that tie points are interpolated within the bounds of the control data. The right-most column (boxed) shows precision estimated without GCPs in the bundle adjustment. These results should not be directly compared with those on the left because the transformation to the geographic coordinate system has to be determined independently and introduces uncertainty that is not represented within the figure. Panels (b) and (c) adapted from James et al., 2020, under a CC BY license (https://creativecommons.org/licenses/by/4.0/).

2.3.2 Systematic error in photogrammetric image networks

Although spatial correlation can be evident in tie point coordinate precision estimates (particularly related to georeferencing, Figure 2.3-4d), other contributions to systematic error are not related to precision, so they are not included in the stochastic model and cannot be reliably estimated internally through the bundle adjustment. Such error may only be reliably identified through comparison with an externally measured reference such as coordinates, lengths, surface profiles or complete surfaces. As for precision estimates, systematic error generally comprises both photogrammetric and georeferencing contributions.

2.3.2.1 Photogrammetric considerations

Systematic photogrammetric error can result from the functional model being either insufficient or poorly determined. A model is insufficient when it omits a part of the physical imaging process with non-negligible effects. For example, for most UAS surveys imaging from ≲100 m above the ground, including an atmospheric refraction correction (Mugnier et al., 2013; Kraus, 1993) may not be necessary. However, omitting refraction modelling might produce measurable systematic error when imaging over kilometre-scale distances (Fraser, 1993). Another example is the correction due to Earth curvature (Mugnier et al., 2013; Kraus, 1993), which, depending on the desired accuracy of the UAS survey, may become significant for survey dimensions ≫100 m. Earth curvature does not present a direct error in the photogrammetry itself, but introduces deformation if field-surveyed control measurements are included without conversion into a true Cartesian coordinate system. Most processing software will carry out this conversion if the control measurements’ coordinate system (e.g. WGS 84) is identified and associated with the control data prior to bundle adjustment. For the majority of UAS surveys, it is generally expected that all the required physical processes are sufficiently represented within the processing software’s functional model.

More commonly, photogrammetric models from UAS surveys are limited by weak image network geometry that leaves some model parameters, particularly those associated with the internal camera imaging geometry, highly interdependent and difficult to define uniquely. ‘Conventional’ aerial survey designs, comprising the grid-style parallel imaging developed for use with purpose-built survey mapping cameras, typically produce weak UAS image networks. In practical UAS surveys following such designs, the bundle adjustment equations often present a complex set of poorly distinguishable local optimisation minima into which the adjustment can converge, resulting in parameter estimates with substantial uncertainty, including serious systematic error (Luhmann et al., 2019). These issues can be particularly problematic when camera self-calibration is included in the processing of weak networks, as is commonly the case for UAS surveys (chapter 1.5). In contrast, ‘strong’ image networks, such as the highly convergent multi-image networks traditionally used for camera calibration or industrial measurement (Fraser, 2001; Fraser, 2013), are represented by a bundle adjustment solution with a clear optimisation minimum, allowing rapid convergence to accurate parameter estimates.

Typically, errors in estimated camera model parameter values correlate with systematic errors in the modelled topographic surface shape, or with the estimated camera positions and orientations. For example, near-parallel imaging directions over relatively low-relief topography (e.g. elevation variations of < 10 % of the flight height) can be associated with poor camera principal distance estimates and are susceptible to correlation between the estimated radial lens distortion and the tie point Z-coordinate values. Estimated principal distance error correlates with error in the estimated flight height above the surface and consequently, for directly georeferenced surveys, can present as a systematic Z-offset of the topographic results (Benassi et al., 2017; Grayson et al., 2018; Przybilla et al., 2020) (e.g. Figure 2.3-2c). In such cases, at least one GCP, and preferably more, is required within the adjustment to improve the camera principal distance estimate.


Error in estimated radial lens distortion typically results in a curved ‘doming’ error of the topographic surface (Figure 2.3-5a). These topographic errors may not be readily observable within the random error patterns seen in topographic coordinate precision estimates when internal correlations between photogrammetric model parameters are high. Within relatively low-relief topography surveys (Carbonneau & Dietrich, 2017; Griffiths & Burningham, 2019; James & Robson, 2014; Javernick et al., 2014; Sanz-Ablanedo et al., 2020), for which such issues tend to be greatest and most obvious, survey-wide systematic error may be modelled and removed (Carbonneau & Dietrich, 2017; James et al., 2020; Sanz-Ablanedo et al., 2020). Surveys of higher-relief areas, giving greater variation of observation distances within and between images, usually represent stronger image networks (i.e. with less-correlated parameters), resulting in smaller magnitude but more complex error distributions (James et al., 2020; Nesbit & Hugenholtz, 2019). The likelihood of important systematic error can be assessed by considering correlations between estimated parameters but, usually, these are only provided for the camera parameters representing the lens model.

2.3.2.2 Georeferencing

Including control measurements in the bundle adjustment (as either ground control point coordinates or camera positions and orientations) is a standard approach to help mitigate systematic error resulting from photogrammetric aspects. However, control measurements also have associated error, which can propagate adversely into the image network if not handled appropriately. At a minimum, control measurements should be provided with their precision estimates. In professional-grade mapping and survey software, it is established practice to provide their associated variance-covariance matrices so that anisotropic survey errors are accounted for. Such software can also allow the network adjustment to be extended to include other types of control observations (e.g. GNSS line lengths and angles from total station observations), along with their measurement uncertainties.

Weak image networks covering areas of low-relief topography are particularly vulnerable to error in control measurements because they can deform relatively easily to accommodate the error. For example, error on widely separated or highly weighted GCPs can result in both local and survey-wide systematic error in topographic results (Figure 2.3-5b, c). When georeferencing is carried out using camera position data, caution is advised to ensure that some ground-based measurements are also acquired and withheld from the adjustment for use as an independent check.

Comparison between estimates of check point coordinates from the bundle adjustment and their field-measured values helps in identifying any systematic datum-related issues that might be represented as survey-wide systematic error. Differences include translations, tilting or scale errors within topographic results that might be manifest across different comparative data and influence decisions when assessing landform change. Comparisons in which this can be most obvious are between a UAS photogrammetric survey and aerial LiDAR or satellite altimetry data. If permanent GCPs can be used for repeat surveys, issues related to datum definition will be minimised in inter-survey comparisons.

Figure 2.3-5: Examples of systematic error as illustrated by Z-difference to an accepted reference survey of the Arolla site (Figure 2.3-4b, c). Red triangles represent GCPs used as control points in the bundle adjustment (filled symbols) or as check points (open symbols). (a) Residual doming/dishing due to correlation between lens distortion parameters, despite well-distributed ground control. (b) Irregular errors highlighting changes in image overlap within a weak survey resulting from over-weighting of control observations. (c) Systematic error resulting from a simulated blunder in a GCP ground survey measurement (arrowed GCP vertically offset by 100 mm). (d) Error resulting from GNSS difficulties in a directly georeferenced survey (the error shown represents the remaining differences following a best-fit translation to the check points).

2.3.3 Quantifying uncertainty

For rigorous use of survey results, uncertainty estimates must accompany geospatial products such as dense point clouds, DEMs and orthomosaics, and must be propagated into subsequent analyses. Communicating and accounting for uncertainty is complicated, from both photogrammetric and georeferencing aspects, by the spatially varying precision and bias components involved. Furthermore, photogrammetric geospatial products are derived from the dense image matching process that follows bundle adjustment. This processing does not change the underlying photogrammetry, so the initial image network uncertainty estimates remain relevant. However, independent verification of final geospatial products is usually warranted due to the additional image matching, which will extend into local areas in which there are few tie points. The subsequent point cloud averaging and interpolation involved in producing DEMs and orthomosaics will also play a part.

Options for quantifying uncertainty in geospatial products vary depending on the level of detail required and the available effort:

• The achieved accuracy of the image network can be assessed as the most straightforward option, based on (1) the misfit to independently measured check points within the network and (2) precision estimates derived from the bundle adjustment. Such assessments should be considered the minimum acceptable; insight into any issues is limited because the precision estimates exclude any effects of bias, there tend to be few check points available, and neither approach considers the geospatial products directly.

• The accuracy of a geospatial product can be estimated through comparison with independent measurements such as check points and GNSS profiles, or with wider established datasets with sufficient quality to represent an accepted reference surface (e.g. a LiDAR dataset or a previously acquired orthomosaic). This can provide a detailed accuracy assessment if a large number of comparisons can be carried out, but requires extensive additional survey, which might be too costly or impractical.

• The repeatability of geospatial products can be determined through inter-comparison of repeated UAS surveys made at a higher temporal resolution than the anticipated landform change. This approach is not widely used, due to the effort required to carry out multiple repeat surveys (with some exceptions, e.g. Goetz et al., 2018; James et al., 2020; Sanz-Ablanedo et al., 2020). However, in combination with an accuracy assessment (as above), it represents the gold standard in determining the uncertainty in geospatial products.

Whichever method is adopted, comparisons must be dispersed widely across the survey to identify spatial variability, and can be provided as 3D values for products such as point clouds, or as 2D or 1D values for orthomosaics and DEMs (i.e. Z-coordinate differences to check points, or to reference profiles or surveys). The resulting differences can be communicated as (see chapter 2.3.5 for examples):

Statistical distributions, illustrated through histograms and quantile-quantile (Q-Q) plots for assessing the normality of errors and identifying outliers.


Summary statistics for characterising survey-wide performance, such as root mean square error (RMSE), mean absolute error (MAE), mean and standard deviation for normally distributed error, and non-parametric equivalents such as the median, quantiles and the normalized median absolute deviation (NMAD; Höhle & Höhle, 2009); a minimal computation of these metrics is sketched after this list. Note that metrics such as RMSE combine components of both trueness and precision, so they should be provided along with others that can isolate bias contributions. Ideally, summary statistics should be accompanied by measures that describe their uncertainty, such as a confidence interval (Höhle & Höhle, 2009).

Map-style visualisations are required to reveal spatial relationships, which cannot be identified through statistical distributions or summary statistics.
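The following short Python sketch computes the summary statistics mentioned above from a vector of coordinate differences (e.g. Z misfits to check points); the dictionary keys are illustrative.

```python
import numpy as np

def error_summary(diff):
    """Summary statistics for a vector of coordinate differences."""
    diff = np.asarray(diff, dtype=float)
    median = np.median(diff)
    return {
        'mean': float(diff.mean()),                        # indicates bias (trueness)
        'std': float(diff.std(ddof=1)),                    # precision, if errors are ~normal
        'rmse': float(np.sqrt(np.mean(diff ** 2))),        # mixes trueness and precision
        'mae': float(np.mean(np.abs(diff))),
        'median': float(median),
        'nmad': float(1.4826 * np.median(np.abs(diff - median))),  # robust spread (Höhle & Höhle, 2009)
        'q95_abs': float(np.quantile(np.abs(diff), 0.95)),
    }
```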

2.3.4 Reducing uncertainty through survey design and processing

Effective survey planning is underpinned by an image acquisition design that is capable of achieving the required survey accuracy. For historical airborne surveys with crewed aircraft and purpose-built metric cameras, consistencies in equipment, acquisition and processing procedures enabled relatively straightforward relationships between design parameters and expected survey accuracy (Kraus, 1993). Similar relationships (e.g. accuracy or precision ratios with viewing distance or GSD) have been derived for SfM-based surveys (Eltner et al., 2016; James & Robson, 2012; Mosbrucker et al., 2017), but they have been found reliable only as broad guides, with literature reviews (Mosbrucker et al., 2017; Smith & Vericat, 2015) illustrating that achieved error magnitudes deviate from forecasts by up to an order of magnitude. For UAS-based surveys, the large variety of systems, image acquisition geometries, image quality and camera-modelling effectiveness limits such direct relations to first-order estimates.

More effective forecasts of UAS survey accuracy tend to be based on the operator’s specific prior experience of the equipment under similar survey scenarios. Under favourable conditions, high-quality surveys may deliver results with a Z-uncertainty of ~1–2 GSD, and better in plan, representing approximately twice the magnitude of the precision estimates determined from the bundle adjustment (James et al., 2020). Such results are usually achieved where best practice guidelines have been followed for control deployment (chapter 2.1) and for providing a strong image network geometry (chapter 1.5). The inherent weakness of image networks from traditional grid-style aerial survey flight paths (particularly for surveying relatively low-relief topography in combination with a requirement for camera self-calibration) can be strengthened by incorporating images from convergent viewpoints (Harwin et al., 2015; James & Robson, 2014; Sanz-Ablanedo et al., 2020), preferably from different heights (Carbonneau & Dietrich, 2017; Fraser, 2001), or by using a survey strategy that continuously varies the acquisition angle (Sanz-Ablanedo et al., 2020).

If a survey does not achieve its design accuracy requirements, then the first steps are to identify whether the issues are trueness- or precision-dominated and to quantify the photogrammetric and georeferencing-related contributions (e.g. see the tests given in chapter 2.3.3). Such insight will guide processes to improve existing surveys by adapting the data processing, and to enhance survey design for future work (Table 2.3.1).

Table 2.3.1: Strategies for reducing uncertainty in photogrammetric products.

2.3.5 Case study: topographic change detection

End-to-end consideration of uncertainty is particularly important when designing optimal surveys for quantifying topographic change. Here, we provide a design example aimed at enabling the detection of surface change exceeding 50 mm over an ~100 m × 200 m area of rugged topography (covering a vertical relief of ~85 m; Figure 2.3-6a). We assess the achieved survey performance by comparing the results of repeat surveys over a period of negligible surface change against those of a reference survey.


2.3.5.1 Survey design and execution

Survey design was based on orthogonal intersecting flight lines, using a 20° forward camera inclination to strengthen the image network. In combination with the strong topographic relief, the resulting convergence within the imaging geometry should facilitate robust camera self-calibration and help avoid dominant systematic error. To meet the design requirement of a 50-mm level of detection (at a confidence level of 95 %), repeated surveys should aim for a precision of ~18 mm (Brasington et al., 2003; Lane et al., 2003). Under favourable imaging conditions, given the minimal vegetation within the area and the limited spatial extent of the survey, this may represent ~one GSD. Thus, for a camera with a 2.4 µm pixel pitch and a principal distance of 8.8 mm, a flying height of ~66 m was required. Ideally, full image coverage should be acquired at this elevation above ground, with additional imagery collected to strengthen the image network as required. However, it is often not possible to maintain a consistent height above ground over rugged terrain, and a practical solution may require compromises such as tolerating greater uncertainty in some survey areas.
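The two design numbers above follow from simple relations, sketched below under the common assumptions that the 95 % level of detection between two surveys of equal precision is 1.96·√2·σ (Brasington et al., 2003; Lane et al., 2003) and that GSD = pixel pitch × flying height / principal distance.

```python
import math

# Precision required so that a 50-mm change exceeds the 95 % level of detection
# between two surveys of equal precision: LoD95 = 1.96 * sqrt(2) * sigma.
sigma = 0.050 / (1.96 * math.sqrt(2))
print(f"required survey precision ≈ {sigma * 1000:.0f} mm")       # ≈ 18 mm

# Flying height that gives a GSD matching that precision for the stated camera.
pixel_pitch = 2.4e-6            # m
principal_distance = 8.8e-3     # m
gsd = 0.018                     # m
flying_height = gsd * principal_distance / pixel_pitch
print(f"required flying height ≈ {flying_height:.0f} m")          # ≈ 66 m
```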

For control and check measurements, 39 artificial GCP targets were deployed as evenly as practically possible (Figure 2.3-6). Their coordinates were recorded using survey-grade GNSS on deployment and retrieval, enabling measurement quality and target stability to be verified by comparing the differences within each measurement pair. All but one GCP showed coordinate differences in line with the GNSS quality estimates (means of 10 mm in the horizontal and 19 mm in the vertical). The outlier GCP had differences an order of magnitude greater and was identified to have been located on unstable ground, so it and the surrounding area were excluded from all analyses.

The UAS survey was carried out with a Phantom 4 Pro quadcopter and repeated five times. An additional flight provided a reference survey, which was processed with all the stable GCPs incorporated into the bundle adjustment as control measurements. Using this SfM-based survey to generate a reference dense point cloud facilitated direct comparisons, but such a reference is itself subject to similar error processes as the surveys being compared against it. Consequently, TLS data were also collected to illustrate comparison with an independent method.

2.3.5.2 Accuracy of the image networks

All images in each network were of sufficient image quality and content to process automatically, with only a few of the lowest altitude images being subsequently removed due to limited distributions of tie points. For an example survey, photogrammetric processing reported that a 17.5 mm mean GSD had been achieved, with an RMS image residual of 0.39 pixels and a mean number of observations per tie point of 6.9. The strength of the convergent image network was reflected by a precision estimate of 0.7 µm (0.008 %) for the camera principal distance and small correlations (magnitudes of ≤0.03) between the radial and decentring lens distortion parameters.

Misfit to independent check point coordinates suggested that the example survey met design requirements, achieving mean values of 0.5, −5.1 and 3.0 mm in X, Y and Z respectively, with associated standard deviations of 5.1, 8.1 and 17.6 mm (0.3, 0.5 and 1.0 GSD). No clear survey-wide bias could be observed, although one control GCP was evident as a potential outlier (Figure 2.3-6c).

The tie point precision estimates from the bundle adjustment were free of the clear systematic, survey-wide variations symptomatic of georeferencing-based limits, highlighting the potential for reducing GCP deployment in future work (Figure 2.3-6d). However, their non-normal distributions (Figure 2.3-6d, inset histograms) had summary statistics that, in Z particularly (median values of 4.7, 4.8 and 10.3 mm in X, Y and Z respectively), were substantially more optimistic than the achieved misfit on check points (17.6 mm).

2.3.5.3 Accuracy of the geospatial products

For a direct assessment of geospatial output, a cloud-to-cloud 3D comparison was carried out between the dense point cloud from the example survey and the reference point cloud using the M3C2-PM algorithm (James et al., 2017; Lague et al., 2013) (Figure 2.3-7). Note that the survey georeferencing was estimated using the control data only. No subsequent transformations were applied to refine the cloud-to-cloud registration, which, in practice, would have the undesired effect of confusing areas of change with stable areas. 3D vectors of difference between the point clouds were calculated for each point of the reference cloud and averaged over a 0.2-m-resolution grid for visualisation (Figure 2.3-7c).

The computed M3C2 difference values were not normally distributed, but their summary statistics (NMAD values of 3.2, 4.0 and 16.2 mm in X, Y and Z respectively, equivalent to ~0.2 GSD in X and Y and 0.9 GSD in Z; Figure 2.3-7a, b) were broadly in line with those of the check point misfits. However, visualising the spatial distribution of point cloud differences reveals regions of local systematic bias (Figure 2.3-7c). Thus, although the summary statistics for both the check point misfits and the dense point cloud differences to a reference survey suggest that the design requirement had been met, clear spatial systematics can be observed, and the survey-wide use of summary statistics that assume normally distributed data should be avoided.


Figure 2.3-6: Network accuracy assessment for an example survey of a high-relief proglacial forefield, Arolla region, Switzerland. (a) Perspective overview showing camera positions and the 20° forward inclination along the orthogonal flight lines. (b) Orthomosaic, with the area of interest outlined in red, and DEM showing GCP locations as red symbols. (c) 3D misfit to control (triangles) and check (circles) points as a map and histograms. (d) Tie point coordinate precision estimates from the bundle adjustment, with inset histograms. Panel (b) adapted from James et al., 2020, under a CC BY license (https://creativecommons.org/licenses/by/4.0/).


Figure 2.3-7: M3C2 differences between the example and reference dense point clouds. Coordinate differences for the X, Y and Z directions as (a) histograms overlain with normal distribution curves parameterised by either the mean and standard deviation (black) or the median and NMAD (red), (b) quantile-quantile plots, and (c) maps of the spatial distribution.

With this comparison between similarly derived SfM-based point clouds, some of the differences in Figure 2.3-7 will reflect error in the reference dataset itself. Best practice would promote comparison against data acquired using an independent technique with an order of magnitude smaller uncertainty, but this is rarely practical or possible. A high-resolution aerial or ground-based photogrammetric survey could be used, but TLS data are often considered to represent a gold-standard benchmark. Comparison of the example survey with a TLS point cloud (Figure 2.3-8) shows evidence of the same systematics observed when using the SfM-based reference (Figure 2.3-7c). However, the different degrees of smoothing and occlusion in the SfM and TLS datasets (Figure 2.3-8c, d) indicate that care must be taken when comparing results from different methods. Assessments of surface change should always consider the influence of the observed 3D landforms on the measurement capabilities, for example, to avoid inappropriately attributing differences that are due to applying different methods, from different observation distances and directions, to either error or to geomorphic change.

Figure 2.3-8: Comparison between TLS and SfM point clouds. (a) Histogram of M3C2 Z-differences. Curves are normal distributions parameterised using the mean and standard deviation (black) or the median and NMAD (red). (b) Spatial distribution of the Z-differences (cf. Figure 2.3-7c, which compares the example SfM point cloud with that from the SfM reference survey). The grey arrow shows the location and direction of the 3D oblique views (c) of point cloud excerpts and their Z-differences. (d) Section X-X’ (see c) through a large block showing differences in smoothing and data coverage between the TLS, SfM and SfM reference (ref.) point clouds.

2.3.5.4 Repeatability of the geospatial products

The repeatability of the dense point cloud output was assessed by comparing differences between the reference point cloud and the point clouds from the repeated surveys. The mean differences (Figure 2.3-9a) demonstrated a very similar pattern of systematic bias to that observed for the example survey (Figure 2.3-7c) and were of greater magnitude than the inter-survey variability (Figure 2.3-9b). Thus, spatial systematics were repeatable between different surveys and represented a consistent pattern of bias.

Figure 2.3-9: 3D point cloud differences between five repeated surveys and a reference survey. (a) Mean point coordinate differences, showing systematic error, particularly in Z, that generally exceeds the inter-survey variability (b). Note the different colour scales used.

2.3.5.5 Change detection

For rigorous identification of topographic change between point cloud datasets, cloud-to-cloud differences need to exceed local thresholds of detectability that reflect the measurement uncertainties (Brasington et al., 2003; Lane et al., 2003; Wheaton et al., 2010). M3C2-PM (James et al., 2017; Lague et al., 2013) uses precision to estimate such levels of detection, LoD95% (for a 95 % confidence level), and only where 3D cloud-to-cloud differences exceed these values are they representative of significant change. Our repeated surveys (over a period of negligible surface change) enable us to assess the uncertainty estimates; analyses should be as sensitive as possible, whilst not erroneously indicating surface change.
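The core of such a threshold is sketched below for the simple per-point case, following the form used by Brasington et al. (2003) and Lane et al. (2003); M3C2-PM applies the same idea along local surface normals using the full point precision estimates, which this sketch does not attempt to reproduce.

```python
import numpy as np

def lod95(sigma_a, sigma_b, t=1.96):
    """95 % level of detection from the precision estimates of two surveys."""
    return t * np.sqrt(np.asarray(sigma_a) ** 2 + np.asarray(sigma_b) ** 2)

def significant_change(diff, sigma_a, sigma_b):
    """Flag cloud-to-cloud differences that exceed the local level of detection."""
    lod = lod95(sigma_a, sigma_b)
    return np.abs(diff) > lod, lod
```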

Between our example SfM survey and the reference survey, using the tie point precision estimates from the bundle adjustment (i.e. one-sigma estimates) within M3C2-PM gives LoD95% values with a median of 21 mm (Figure 2.3-10a). However, these LoD95% values leave clear areas erroneously identified as significant change (Figure 2.3-10b, left panel), reflecting the bias due to systematic contributions to uncertainty that are not included in the precision estimates (e.g. the systematics observable in Figure 2.3-7c). Using twice the precision values (i.e. two-sigma estimates) within M3C2-PM has previously been found to improve results (James et al., 2020), and makes the comparison appear more reasonable, but with some spatial systematics remaining (Figure 2.3-10b, right panel).

The complexity of these systematics indicates that they cannot easily be accounted for by a straightforward analytical model (such as may be used to mitigate doming (James et al., 2020; Sanz-Ablanedo et al., 2020)). However, the systematics are also evident within the mean differences to the reference from the repeated surveys (e.g. Figure 2.3-9a). Correcting our example survey for this bias (by subtracting a mean error surface derived from the other surveys) brought differences closer in line with those expected from the one-sigma precision estimates, and slightly exceeded expectations if two-sigma values were used.

2.3.6 Summary

Through their derivation from optimised model parameter estimates rather than from direct measurements, all photogrammetric output contains systematic and random components of uncertainty, arising from both model and input error. The accuracy of geospatial photogrammetric output from UAS surveys is influenced by environmental, equipment, imaging strategy and processing effects. Survey-to-survey variability of these factors affects the magnitudes of random and systematic errors. High-quality surveys can achieve a Z-uncertainty of ~one–two GSD. Separating the random error and systematic bias contributions can enable modelling and reduction of the systematic aspects, but may require extensive external measurements that entail considerable effort. To guide improvements in survey design, insight into survey-limiting factors can be gained by considering uncertainty in terms of photogrammetric components (which affect the shape of the topographic output) and georeferencing components (which reflect how the output data are represented in the world coordinate system).

To support rigorous assessment of surface change, photogrammetric software should enable input measurements to be associated with a priori variance-covariance matrices. All estimated parameter values (e.g. topographic coordinates) should be considered in association with their a posteriori variance-covariance estimates. Patterns of coordinate variance and covariance should be visualised, compared with expected topographic change and, ultimately, the values may be used to determine spatially variable levels of change detection.

Figure 2.3-10: Detection of 3D differences between the example SfM survey and the reference SfM survey. (a) 3D level of detection between the surveys, LoD95%, determined by M3C2-PM for tie point precision estimates based on one (left column) or two (right column) sigma. (b, c) Significant 3D differences between the surveys (i.e. where differences exceed the local 3D LoD95%); white represents cropped areas and the tie points for which differences do not exceed LoD95% (their corresponding proportion of all tie points in the area of interest is given by the ‘no diff.’ percentage). Differences are shown for the original example survey (b) and for the example survey following bias correction by subtraction of the mean error derived from four similar surveys (c). Bias correction brings the survey statistically closer in line with the reference survey, when considering precision-based levels of detection.

References for further reading


2.4 Thermal-infrared imaging

Emile Faye, Audrey Jolivot, Jérôme Théau, Jean-Luc Regnard, David Gómez-Candón

2.4.1 Principles and functioning of thermal imaging
2.4.1.1 The theory of thermal-infrared
2.4.1.2 How do thermal-infrared cameras work?
2.4.1.3 Which TIR cameras are available for UAVs?

2.4.2 Acquisition processes with TIR UAV
2.4.2.1 The flying platform
2.4.2.2 Ground measurement devices
2.4.2.3 Camera settings and flight planning

2.4.3 TIR images (pre-)processing
2.4.3.1 Thermal radiometric corrections
2.4.3.2 Orthomosaics and geometric corrections using TGCPs
2.4.3.3 Correcting emissivity values for each object in a thermal map
2.4.3.4 Surface temperature cross-validation

2.4.4 Analyzing TIR images acquired by UAV
2.4.4.1 Relative or absolute surface temperatures
2.4.4.2 Image co-registration and data fusion
2.4.4.3 Object-based analysis
2.4.4.4 Vegetation indices using the TIR band

2.4.5 Challenges and limits

Infrared thermography is a non-invasive method that uses a thermal imager (thermal camera) to detect the radiation (heat) emitted by all objects above absolute zero temperature and to convert it into temperature. Thermal cameras provide a continuous distribution of surface temperature, called a thermogram, that makes it possible to detect heat-producing objects invisible to the human eye (Vollmer & Möllmann, 2010). Major developments in infrared thermography over the past decade have significantly improved its application in various domains: military (guidance systems and engine detection), electronics (detecting overloaded electrical circuits), surveillance, search and rescue of people, disease control (Covid-19 fever), wildlife surveys (animal detection), medicine (assessment of circulatory disorders), building inspection (heat losses), etc. Among these applications, infrared thermography has also been increasingly used in various fields of environmental sciences such as animal (Briscoe et al., 2014) and plant physiology (Still et al., 2019), agronomy (Maes & Steppe, 2012) and landscape ecology (Scherrer & Koerner, 2010).

Thermal cameras carried onboard UAVs can harvest thermal-infrared (TIR) images remotely, providing low-cost approaches to meet the critical requirement of bridging fine spatial and temporal resolutions with the coverage of large environmental scenes. Autonomously operated, flying low and slow, UAVs equipped with TIR cameras offer scientists new opportunities for measuring and studying thermal environments. By this means, the spatial variability of temperature across an ecosystem or at the organism level (e.g., a plant) can be acquired over large areas and at a greater level of detail (Figure 2.4-1) than with ground-based thermal imagery or the usual recording with temperature loggers. In addition, the prices of TIR cameras and UAVs are continuously decreasing, while their management and maintenance become more automated and simpler. Consequently, these flying systems are becoming more affordable and accessible.

However, recording appropriate thermal data using TIR cameras onboard UAVs is not straightforward, as many pitfalls must be avoided along the acquisition process. Indeed, depending on the objectives of the study, whether to retrieve the accurate surface temperature of an object of interest (Gómez-Candón et al., 2016), to map and compare thermal heterogeneities at the landscape scale (Faye et al., 2016a), or simply to detect endothermic animals (Chrétien et al., 2016; Burke et al., 2019), the use of TIR cameras onboard UAVs raises major issues that have to be taken into consideration before flying.

For instance, TIR imaging devices that meet the weight and energy-consumption constraints of UAVs are based on microbolometric sensors that are not stabilized at a constant temperature, resulting in instability and drift in the temperature recording. Moreover, the spatial resolution of TIR sensors restricts the flight planning for thermal mapping and makes the mosaicking of TIR images less accurate in the photogrammetric process than for Red Green Blue (RGB) images. Temperature measurements performed by TIR cameras are also affected by the physical properties of the studied object, such as its capacity to emit in the thermal band (i.e., the object emissivity). The ratio between object size and the spatial resolution of the image, as well as the thermal contrast between the target and its environment, also impact TIR measurements. Several parameters external to the studied object affect the values recorded by the thermal sensors, such as the ambient atmospheric conditions (e.g., air temperature and humidity, atmospheric pressure, wind speed) and the presence of fog, dust or smoke (Meier et al., 2011). Varying weather conditions during the acquisition flight (e.g., cloud passes) will also have strong impacts on TIR readings. In studies requiring accurate temperature measurements, these effects must be minimized and corrected using a proper radiometric calibration, which requires the acquisition of meteorological data.

Figure 2.4-1: Examples of thermal-infrared images captured on-board UAVs for addressing environmental issues. 1 and 2 show the RGB and TIR images, respectively. (A) Agroecological landscape in the center of France. (B) Riparian and stream ecosystem in Quebec, Canada. (C) Hot springs in the Sajama altiplano, Bolivia. (A) and (C) © CIRAD – E. Faye. (B) © Centre de géomatique du Québec – P. Ménard. All rights reserved.


This chapter details the functioning and settings of TIR cameras, the specificities of TIR flight planning, radiometric calibrations and geometric corrections, meteorological recording, and the orthomosaicking of TIR images, and illustrates TIR data analysis from object-based and thermal landscape analysis to vegetation indices. Finally, we illustrate some of the challenges and limits of using TIR cameras onboard UAVs that remain to be overcome in the future.

2.4.1 Principles and functioning of thermal imaging

2.4.1.1 The theory of thermal-infrared

Infrared thermography is an imaging method that records the radiation emitted by an object at wavelengths ranging from 7.5 to 14 μm in the electromagnetic spectrum. An object radiates in the thermal-infrared band as a result of the molecular motion that depends on its temperature [1]: the hotter the object, the more its molecules move and the more the object emits in the thermal-infrared. According to Planck's law, the radiation emitted by a perfect blackbody (i.e., a theoretical object at thermal equilibrium that absorbs all radiation) depends only on its temperature, whatever its composition or shape (see [1] for details). Under the same conditions, any real object will emit radiation as a proportion of the blackbody radiation. This ratio is characterized by the emissivity of the object (ε), which describes its effectiveness at emitting TIR radiation. Emissivity values vary between zero and one depending on the chemical composition and physical structure of the object. For instance, the emissivity of plants ranges between 0.95 and 0.99, higher when they contain more chlorophyll and water, with an average of 0.98, meaning that their surface emits 98 % of the energy emitted by a perfect blackbody at the same temperature (Rubio, 1997). Consequently, any object that has a temperature above absolute zero (-273.15 °C or 0 K, at which all molecular motion stops) will emit a defined quantity of radiation in the TIR band depending on its temperature and emissivity.
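The relationship between temperature, emissivity, and emitted radiance can be illustrated numerically. The sketch below (Python) evaluates Planck's law for a blackbody in the 7.5–14 μm band and scales it by an assumed emissivity of 0.98 to approximate the radiance of a vegetated surface; it is a simplified grey-body illustration, not a camera calibration routine.

import numpy as np

# Physical constants
H = 6.626e-34   # Planck constant (J s)
C = 2.998e8     # speed of light (m/s)
KB = 1.381e-23  # Boltzmann constant (J/K)

def planck_radiance(wavelength_m, temperature_k):
    """Spectral radiance of a blackbody (W m^-2 sr^-1 m^-1), Planck's law."""
    a = 2.0 * H * C**2 / wavelength_m**5
    b = np.expm1(H * C / (wavelength_m * KB * temperature_k))
    return a / b

wavelengths = np.linspace(7.5e-6, 14e-6, 50)   # TIR band, 7.5-14 um
T_leaf = 295.0                                  # ~22 degC expressed in kelvin
emissivity = 0.98                               # typical value for vegetation

blackbody = planck_radiance(wavelengths, T_leaf)
leaf_radiance = emissivity * blackbody          # grey-body approximation
print(leaf_radiance.mean())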

2.4.1.2 How do thermal-infrared cameras work?

TIR cameras are imaging devices that deliver a visual representation of the thermal radiation emitted by objects. Because of their low load capacity, UAVs need to carry light-weight, small-size, low-power, uncooled thermal cameras, in which the TIR sensor is not stabilized at a constant temperature (Kelly et al., 2019). The functioning of uncooled TIR cameras is based on a microbolometer sensor (Figure 2.4-2) made of an array of pixels built in an absorbing material that has a temperature-dependent electrical resistance (commonly silicon or vanadium).


When TIR radiation heats the detector material, the variation in the electrical signal is measured and compared to the value at the operating temperature of the sensor. By taking into account the ambient temperature and the object emissivity, these changes in electrical signal are converted into temperature values that are displayed as monochrome or false-color images (i.e., 1 band) visible to the human eye.

One major difference with optical RGB cameras is that the lenses of TIR cameras cannot be made of glass, as glass blocks TIR radiation. Thus, lenses are made of specific materials (such as crystalline silicon or fluoride), which is one of the main reasons explaining the high cost of TIR cameras (between 1,000 and 10,000 US dollars). Another specificity of uncooled TIR cameras is the lack of an internal temperature control system (conversely to cooled TIR cameras), which brings instability and strong drift in temperature acquisition by the microbolometers (Mesas-Carrascosa et al., 2018). In order to reduce this drift during operation, and consequently to reduce the inaccuracies in temperature measurements, uncooled TIR cameras are equipped with a self-calibration system taking advantage of an internal reference source that regularly updates the offset parameters (Olbrycht et al., 2012). This self-calibration harmonizes the response signal across the entire sensor and reduces the inaccuracies due to the sensor temperature dependency (Mesas-Carrascosa et al., 2018). These TIR camera inconveniences must be taken into account to retrieve accurate surface temperatures (see 2.4.3).

Figure 2.4-2: Schematic of a silicon-based microbolometer pixel. Reproduced with permission from SPIE – the International Society for Optics and Photonics: Yon et al., 2008.


2.4.1.3 Which TIR cameras are available for UAVs?

Choosing a TIR camera for UAV applications depends on the spatial resolution and thermal accuracy needed, but also on the weight and the price of the device. Due to their construction, the sensors of uncooled TIR cameras offer a much lower spatial resolution (mostly 640 x 480 pixels, Table 2.4-1) than other cameras suitable to be carried onboard UAVs, although the most expensive cameras can achieve a resolution of 1,280 x 1,024 pixels. The low resolution of TIR sensors brings issues for TIR acquisition that have to be considered. The UAV flights must be planned in stable atmospheric conditions (e.g., no clouds, low wind speed), while the resolution of the thermal image (depending on the flight altitude) must be fine enough to avoid an excess of mixed pixels at the border of objects of interest. Moreover, the thermal sensitivity, or noise-equivalent temperature difference expressed in milli-Kelvin, is another key parameter for choosing a TIR camera. Thermal sensitivity (or thermal resolution) measures how well a TIR camera is able to distinguish very small differences in thermal radiation within one image. Manufacturers also provide temperature accuracies for their TIR cameras, which represent the error made by the camera on the temperature reading. Thermal accuracy usually ranges between ±0.1 and ±5 °C. However, numerous studies have shown that the accuracy of a TIR camera depends on the ambient conditions in which the shooting occurs. For instance, Kelly et al. (2019) revealed that the thermal accuracy of a FLIR Vue Pro (radiometrically uncalibrated, FLIR Systems, Inc., Wilsonville, USA) varied from ±0.5 °C when used under stable laboratory conditions (i.e., air temperature maintained constant at 20.6 °C) to ±5 °C when used in outdoor conditions for TIR UAV mapping (with flight conditions varying between partly cloudy and full sun, air temperatures between 20–28 °C, and wind speeds up to 4 m s−1). Finally, some TIR cameras are factory-calibrated to generate non-uniformity compensation coefficients, which are applied automatically by the camera in real time to maintain good image quality. These coefficients are based on pre-set ambient temperature and shooting distance, and take into account the TIR radiation emitted by the different parts of the camera itself (Olbrycht et al., 2012) (e.g., interior, lens).


Table 2.4-1: Examples of currently available thermal cameras to be carried onboard UAVs.
* Combined with a 5-band sensor (R, G, B, red edge, near-infrared)
** Combined with a 4K RGB camera

2.4.2 Acquisition processes with TIR UAV

Before taking off with a TIR camera onboard a UAV, many particularities must be taken into account in order to retrieve either accurate (absolute) or relative surface temperatures, such as TIR camera settings, flight planning, weather conditions, geometric and radiometric corrections, and orthomosaicking. Below, we present a step-by-step process for obtaining high-resolution, spatially distributed, correct temperatures that can be used to address environmental issues (Figure 2.4-3).


Figure 2.4-3: An example of a methodological flowchart to acquire and process thermal-infrared images with UAV. Photograph of an uncooled TIR camera (InfraTec VarioCAM® HR 600) onboard a flying platform © CIRAD – E. Faye. All rights reserved.

2.4.2.1 The flying platform

Depending on the application, different types of UAVs can carry TIR cameras onboard (Watts et al., 2012). In most cases, TIR cameras will be carried on UAV multi-copters, which provide more stability and can hover. As always, flying a UAV is a trade-off between the payload, the battery capacity, the flight elevation and speed, the extent covered, and the desired image resolution. Flying with TIR cameras onboard will affect each of these parameters. Indeed, the flight time significantly decreases as the payload increases. TIR cameras made for UAV applications are usually connected to, managed by and thus fully interoperable with the UAV system and flight controller: control, tilting, and triggering are based on the Global Navigation Satellite System (GNSS) data of the platform.

2.4.2.2 Ground measurement devices

The thermal radiance emitted by the object and captured by the TIR camera is modified by the qualitative and quantitative features of the atmosphere between the object and the sensor (Scherrer & Koerner, 2010). Indeed, the atmosphere: (i) reduces the original signal (by absorption and scattering), and (ii) adds its own signal (related to the atmosphere's temperature, its relative humidity and other components). This results in a change in the TIR readings by the camera as the shooting distance increases (Figure 2.4-4), even at very short distances (Faye et al., 2016b).


Figure 2.4-4: Effect of flight height on TIR readings (adapted from Faye et al., 2016a with permission of Wiley).

In order to retrieve an accurate and absolute measurement of the surface temperature of the object of interest, various methods have been described in the literature: application of radiative transfer models (i.e., simulating atmospheric interference, Dubuisson et al. (2005)), empirical atmospheric radiance corrections using the ambient temperature of a blackbody (linear or polynomial models, Torres-Rua (2017)), or neural networks (Ribeiro-Gomes et al., 2017). In practice, an empirical calibration method based on known temperatures of thermal targets on the ground, used to apply the radiometric correction to the TIR images, is the most commonly used approach (Kelly et al., 2019). This is the method presented below.

• Radiometric thermal targets

The radiometric correction detailed here is based on knowledge of the absolute surface temperature of specific objects (radiometric thermal targets) during the airborne TIR image acquisition. The device is composed of four contrasting temperature targets (Lambertian surfaces) that produce a large temperature range: two extreme temperature targets, which represent the hottest and coolest temperatures of the area of interest, and two intermediate temperature targets. For example, targets can be made of white polystyrene (cold), a black-painted wood panel (hot), and dry and wet bare soil for the intermediate targets (Figure 2.4-5). The temperature of each target can be continuously measured during the UAV flight using thermo-radiometers placed above the target (such as an IR120, Campbell Scientific, ±0.2 °C accuracy when calibrated against a blackbody) and recorded by a datalogger (Jolivot et al., 2017). Alternatively, the temperature of ground targets can be measured continuously using a thermocouple placed at the surface of the targets. Radiometric targets must be located within the cover zone of the UAV flight and be large enough to ensure they are easily detectable within the UAV images (i.e., several homogeneous pixels in the TIR image, depending on the resolution of the TIR camera used).

• Geometric thermal targets

Similar to RGB mapping, thermal ground control points (TGCPs) improve the georeferencing of TIR mapping products. TGCPs must be easily identifiable in the TIR band. Indeed, not all objects appearing in an RGB image may be distinguishable in the TIR image. For example, two features of different colours (i.e., different signatures in the visible spectrum) can have the same temperature and become indistinguishable in TIR images, and vice versa (can you spot the cold stream in Figure 2.4-1C.2?). It is therefore advisable to use highly or weakly reflective surfaces displaying a high thermal contrast with the surroundings (Figure 2.4-6). Different materials and shapes are suitable for making easily recognisable TGCPs (e.g., a cross, triangle, circle, or square made of a piece of metal, a black-painted panel, or a wooden board covered with an aluminium film). The TGCP size will depend on the spatial resolution of the TIR sensor used (usually five to ten-fold larger to ensure their visibility).

Figure 2.4-5: Thermal targets for radiometric calibration. A) Schematic drawing of the ground measurement device, B) and C) images of the device in the RGB and TIR bands, respectively, D) absolute temperatures recorded over different radiometric targets during UAV flights (adapted from Jolivot et al., 2017, originally published under a CC BY license https://creativecommons.org/licenses/by/4.0/).

Figure 2.4-6: Geometric thermal target to improve the georeferencing of a TIR map. RGB (A) and TIR (B) images of a thermal target made of a black cross on a wooden board, both contrasting with the surroundings in the TIR and RGB bands. © IRBI – S. Pincebourde. All rights reserved.

Because these patterns are easily identifiable in UAV images in both the RGB and the TIR bands, they can be used to georeference both the RGB and TIR maps (see details in Figure 2.4-7).

• Meteorological records and optimal flight conditions

In order to compare temperatures between images or for thermal mapping, weather conditions have to be as stable as possible. Therefore, changes in weather conditions should be monitored to ensure their stability during TIR UAV image acquisition (Faye et al., 2016a). Indeed, changes in weather conditions can have a rapid and adverse impact on the object's surface temperature (mainly wind gusts and variations in solar radiation due to cloud passes (Kelly et al., 2019)). Ideally, meteorological data should be recorded using a weather station located near the flight area and recording at a fine time step. The main parameters to monitor are: air temperature and relative humidity, solar and atmospheric radiation, and wind speed and direction. Accurate and synchronized time-keeping must be ensured across all devices (dataloggers and TIR camera) and the timer of the GNSS receiver of the UAV. This monitoring makes it possible to confirm that weather conditions were stable during the flights (Figure 2.4-7); if not, the acquisition flight should be performed again. Therefore, TIR UAV flights and image acquisition have to be carried out during steady weather conditions, typically full-sun periods, with no wind (i.e., gusts below 20 km/h), and no dust or smoke.
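One simple way to screen the logged meteorological data for stability is sketched below (Python). The 5 % threshold and the example records are illustrative assumptions, not a published criterion; the study-specific tolerance should be set according to the application.

import numpy as np

def stable_during_flight(solar_radiation_w_m2, max_relative_variation=0.05):
    """Simple stability check: relative spread of solar radiation during a flight."""
    values = np.asarray(solar_radiation_w_m2, dtype=float)
    variation = (values.max() - values.min()) / values.mean()
    return variation <= max_relative_variation

# Hypothetical 1-minute solar radiation records (W/m2) logged during one flight
flight_log = [905, 910, 899, 912, 903, 908]
print(stable_during_flight(flight_log))   # True -> conditions acceptably stable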


Figure 2.4-7: Solar radiation recorded during a day with clear sky conditions in the south of France in August 2013. The red vertical lines indicate the times of five successive UAV flights. The small peaks below the blue curve are linked to little cloud passes. Unless otherwise stated, all images were prepared by the authors for this chapter.

Moreover, the optimum flight conditions depend on the object of interest and must be established with a thorough knowledge of the studied system. For example, the thermal contrast between the studied object and its surroundings must be maximal in order to optimize the detection of endothermic animals (e.g., flying at night to detect endothermic organisms will be more efficient than during the day because this maximizes the difference in thermal radiance between the studied object and its environment). Not all objects have the same physical properties related to the absorption and emission of thermal-infrared radiation (see 2.4.1). For instance, some targets such as minerals (e.g., rocks) have a higher thermal inertia than others such as vegetation. Rocks thus absorb thermal-infrared radiation more slowly but also re-emit it more slowly than vegetation, causing a greater (e.g., rocks) or smaller (e.g., vegetation) lag in contrast with the surrounding temperatures.

2.4.2.3 Camera settings and flight planning

First of all, in order to retrieve stable temperature data, the TIR camera needs to pre-heat before flying (i.e., stabilisation time). Ribeiro-Gomes et al. (2017) studied the stability of uncooled TIR cameras with a blackbody device and showed that, under laboratory conditions, at least 30 minutes of pre-heating are needed to obtain stabilized, good-quality data (Figure 2.4-8). Then, the emissivity should be set to the emissivity value of the main studied object, referring to emissivity tables (Rubio, 1997) or by determining it experimentally (Zhang et al., 2016). If various objects of interest with different emissivities are studied, adjustments can be made in the post-processing (see 2.4.3). To avoid acquiring blurred images, the focus of the lens has to be set manually to the flight height. Last, as stated in 2.4.1, the ambient temperature has to be set as an input in TIR cameras that feature an internal calibration system (Table 2.4-1).


Figure 2.4-8: 30 minutes of temperature readings (Thermotechnix Miricle camera) after switching on the TIR camera. Data acquired every 20 seconds on a blackbody set to 20 °C. The jumps correspond to self-calibration events of the TIR camera.

The flight planning for TIR image acquisition is designed in a similar way as for RGB cameras (see chapter 1.5), but it is necessary to consider the lower resolution of the TIR sensor, the shutter speed, and the triggering limitations of TIR cameras. Therefore, the flying speed, the elevation, and the frontal and side overlap have to be defined considering the TIR camera specifications. The TIR ground sampling distance, which depends on the pixel size and flying height, should be chosen to ensure object detection in the TIR band. For example, Burke et al. (2019) target a minimum of ten pixels per object (i.e., animal) to ensure effective detection. This aspect must also be taken into account to limit the effects related to mixed pixels (see 2.4.5).
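A quick way to check whether a planned flight height yields enough pixels on the object of interest is sketched below (Python). The sensor parameters are illustrative values for a generic microbolometer camera, not those of a specific model:

def tir_gsd(flight_height_m, pixel_pitch_um, focal_length_mm):
    """Ground sampling distance (m/pixel) for a nadir-looking TIR camera."""
    return flight_height_m * (pixel_pitch_um * 1e-6) / (focal_length_mm * 1e-3)

def pixels_on_object(object_size_m, gsd_m):
    """Approximate number of pixels across an object of a given size."""
    return object_size_m / gsd_m

# Illustrative values: 17 um pixel pitch, 13 mm lens, 60 m flight height
gsd = tir_gsd(flight_height_m=60, pixel_pitch_um=17, focal_length_mm=13)
print(f"GSD: {gsd:.3f} m/pixel")

# e.g., a 0.8 m animal; Burke et al. (2019) suggest >= 10 pixels per object
print(f"Pixels across object: {pixels_on_object(0.8, gsd):.1f}")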

The flight plan should be designed to fly over the radiometric calibration targets as many times as possible per flight (see 2.4.2). This helps to achieve a robust calibration relationship and therefore improves the accuracy of the results. For instance, the UAV can make three passes over the radiometric thermal targets on each flight: at take-off, at landing, and once in between (Figure 2.4-9).


2.4.3 TIR images (pre-)processing

A low-altitude TIR image acquired by UAV often cannot cover the whole area of interest. It is therefore necessary to take a series of images, which then have to be radiometrically corrected and/or ortho-rectified and mosaicked to map the area of interest. Thermal image processing is a time-consuming step, which must be largely automated before infrared thermography can be applied as a routine tool in environmental practice. The desired temperature accuracy and spatial resolution must be chosen by taking into account the aim of the study.

2.4.3.1 Thermal radiometric corrections

The radiometric corrections can be performed by computing empirical linear equations each time the UAV captures the radiometric thermal targets, depending on the flight plan (three times in our example, Figure 2.4-9). The average TIR values for each target (four in the example given in Figure 2.4-5) must be calculated for each image acquired above the targets and then compared to the ground data recorded at exactly the same time by the ground device (Figure 2.4-9). Then, the linear regression equations computed on these temperature data must be applied to the TIR images acquired closest to the time of the UAV pass over the targets. In our example, the UAV passed three times over the targets and therefore three equations can be computed.
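A minimal sketch of such an empirical correction is given below (Python, with made-up target values): the image values extracted over the four targets are regressed against the simultaneously logged ground temperatures, and the resulting linear equation is applied to the images acquired closest in time to that pass. It illustrates the principle only; real workflows operate on georeferenced rasters and per-pass target extractions.

import numpy as np

# Hypothetical mean TIR image values over the four targets (one UAV pass)
image_values = np.array([12.4, 21.8, 27.5, 39.1])     # camera readings (degC)
# Ground truth from thermo-radiometers logged at the same time (degC)
ground_temps = np.array([10.2, 20.5, 26.9, 41.0])

# Fit a linear calibration: T_ground = gain * T_image + offset
gain, offset = np.polyfit(image_values, ground_temps, 1)

def calibrate(tir_image):
    """Apply the empirical linear correction to a TIR image (numpy array)."""
    return gain * np.asarray(tir_image) + offset

# Apply to the images acquired closest in time to this pass over the targets
corrected = calibrate(np.array([[15.0, 30.2], [22.1, 18.7]]))
print(gain, offset)
print(corrected)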


Figure 2.4-9: Radiometric calibration equation applied to retrieve calibrated temperature data. A. Green marks on the flight plan represent a single TIR image acquisition and red marks represent TIR images acquired over the thermal calibration targets. B. Example of a ground-based TIR imagery radiometric calibration equation for one passage over the thermal targets. Each red dot represents a ground target in Figure 2.4-5. Adapted by permission from Springer Nature Customer Service Centre GmbH: Springer, Precision Agriculture, Gómez-Candón, D., Virlet, N., Labbé, S., Jolivot, A. & Regnard, J. L.: Field phenotyping of water stress at tree scale by UAV-sensed imagery: new insights for thermal acquisition and calibration, © (2016).

Other authors (Salgadoe et al., 2019) made use of non-reference histogram methods for determining average surface temperature, bypassing the procedure based on thermal references.

2.4.3.2 Orthomosaics and geometric corrections using TGCPs

The generation of TIR mapping products can follow the steps presented in chapter 2.2 for RGB orthoimage generation (see examples of an RGB and a TIR orthomosaic in Figure 2.4-10). However, the low spatial resolution of the TIR sensor and the low image quality in terms of contrast and noise, leading to a low signal-to-noise ratio, make the mosaicking of TIR images less accurate compared to the use of RGB images. One way to improve the accuracy of TIR mapping is to process it in combination with RGB imagery. The RGB images are used to calculate a high-resolution digital elevation model onto which the lower-resolution TIR images are projected (e.g., Sledz et al., 2018 and Ribeiro-Gomes et al., 2017). This procedure makes it possible to obtain TIR orthomosaics of a much higher quality in terms of geometric correctness.

Figure 2.4-10: UAV-based RGB (A) and TIR (B) orthomosaics of a French apple orchard. The white and blue dots visible in the RGB and TIR orthoimages, respectively, are the geometric thermal ground control points.

2.4.3.3 Correcting emissivity values for each object in a thermal map

If various objects of interest with different emissivity values are to be studied on the same TIR map (e.g., in order to compare their accurate temperatures under the same environmental conditions), Faye et al. (2016a) developed a remote sensing procedure bringing together object recognition, masking and cropping of objects, and emissivity assignment, which can be used to extract object temperatures with the appropriate emissivity value. However, by applying this process, Faye et al. (2016a) assume that emissivity is spatially and temporally homogeneous for the same object (see Zhang et al., 2016 for details). Moreover, one should be aware that a change in emissivity within a 5 % range will only slightly impact the final temperature values (Clark, 1976). Thus, correcting the emissivity values of different objects in a TIR map should be done when retrieving fine-scale discrepancies in absolute temperatures between objects or when studying objects with different values of emissivity (Faye et al., 2016a).
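A heavily simplified sketch of per-object emissivity assignment is shown below (Python). It assumes a classified map with one emissivity value per class, that the camera recorded apparent temperatures with the emissivity set to 1, and it neglects reflected and atmospheric radiation; the class labels and emissivity values are illustrative and this is not the procedure of Faye et al. (2016a).

import numpy as np

# Illustrative emissivity values per object class (from published tables)
EMISSIVITY = {0: 0.98,   # vegetation
              1: 0.95,   # bare soil
              2: 0.92}   # rock

def correct_for_emissivity(apparent_temp_c, class_map):
    """Approximate object temperature from apparent (epsilon = 1) temperature.

    Uses the Stefan-Boltzmann relation T_obj ~ T_apparent * eps**(-1/4)
    (in kelvin), ignoring reflected and atmospheric contributions.
    """
    t_kelvin = np.asarray(apparent_temp_c, dtype=float) + 273.15
    eps = np.vectorize(EMISSIVITY.get)(class_map)
    return t_kelvin * eps ** (-0.25) - 273.15

apparent = np.array([[24.5, 25.1], [31.0, 38.2]])   # apparent temperatures (degC)
classes  = np.array([[0, 0], [1, 2]])               # object classes from RGB map
print(correct_for_emissivity(apparent, classes))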

2.4.3.4 Surface temperature cross-validation

Once the TIR orthomosaics have been obtained (radiometrically and emissivity corrected), the accuracy of their temperatures can be checked. A simple method for temperature quality cross-validation consists in comparing temperature values in the corrected TIR image with the temperatures acquired by thermo-radiometers located above artificial targets specifically built for this purpose or placed above natural surfaces already existing in the landscape.

2.4.4 Analyzing TIR images acquired by UAV

2.4.4.1 Relative or absolute surface temperatures

Choosing to work with absolute or relative temperature is an important decision when analysing thermal images. Absolute temperature analysis is justified when comparing image series (e.g., multi-temporal or multi-site) or when an accurate measurement of the temperature of an object is needed.

On the other hand, relative temperature retrieval is suitable to compare thermal data across space (e.g., temperature differences between objects) within the same image or for object detection. Surface temperature excess (i.e., the positive or negative deviation between pixel temperature values in the TIR images and ambient air temperature) is a relevant index for direct comparisons of object surface temperatures captured under different conditions, regardless of their absolute temperature dissimilarities. But surface temperature excess is sensitive to radiative conditions, wind speed, and vapour pressure deficit (Maes & Steppe, 2012). Temporal comparisons of object responses to environmental conditions based on this index require that ambient conditions are controlled or remain mostly unchanged during experiments (Berger et al., 2010).

2.4.4.2 Image co-registration and data fusion

The coarse resolution of TIR images can be combined with the finer resolution of an RGB image in order to upscale the resolution of the thermal image. A data fusion procedure proposed by Kustas et al. (2003) consists first in regressing the thermal pixels of the TIR image against an index derived from the high-resolution pixels of the RGB image, this method being feasible only when the TIR and RGB images are properly overlaid. Secondly, on the basis of the regression, an estimate of temperature can be obtained for each subpixel at the finer resolution.
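A minimal sketch of this regression-based sharpening is shown below (Python), assuming co-registered arrays: the coarse TIR pixels are regressed against an index aggregated from the fine-resolution image (here a hypothetical RGB-derived greenness index), and the regression is then evaluated at the fine resolution. It is a toy illustration of the principle, not the full procedure of Kustas et al. (2003).

import numpy as np

def sharpen_tir(tir_coarse, index_fine, block):
    """Regression-based thermal sharpening (toy version).

    tir_coarse: 2D array of TIR temperatures at coarse resolution.
    index_fine: 2D array of an RGB-derived index at fine resolution, with
                shape (block * tir rows, block * tir cols).
    block:      number of fine pixels per coarse pixel along each axis.
    """
    rows, cols = tir_coarse.shape
    # Aggregate the fine index to the coarse grid (mean per coarse pixel)
    index_coarse = index_fine.reshape(rows, block, cols, block).mean(axis=(1, 3))
    # Regress coarse temperature against the aggregated index
    slope, intercept = np.polyfit(index_coarse.ravel(), tir_coarse.ravel(), 1)
    # Predict temperature for each fine-resolution pixel
    return slope * index_fine + intercept

# Hypothetical 2x2 coarse TIR image and 4x4 fine-resolution index (block = 2)
tir = np.array([[30.0, 27.0], [33.0, 24.0]])
idx = np.random.default_rng(0).uniform(0.2, 0.8, size=(4, 4))
print(sharpen_tir(tir, idx, block=2))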

2.4.4.3 Object-based analysis

A category of TIR image analysis focuses on object selection, delineation, and identification based on spectral (i.e., thermal), shape, and contextual information. This type of analysis can be carried out by photointerpretation or by image processing. Interpretation is done in a similar way to traditional aerial photographs (i.e., in-situ visual detection made by an observer on board an aircraft). It can also benefit from the motion dimension made possible by acquiring video data, available for some sensors. Recording moving objects in TIR video provides an additional detection feature, namely the movement of the target. The detection of moving objects in the video helps the observer to limit the potential confusion between animals and objects (e.g., rocks, stumps). However, photointerpretation remains tedious to perform and is dependent on the observer.

Automatic image analysis allows standardization of the detection approach and the processing of large quantities of images. Whether pixel-based or object-oriented, possibly performed with artificial-intelligence methods (e.g., convolutional neural networks), these approaches provide increasing accuracies but remain demanding in terms of parameterization and data availability. Faye et al. (2016a) present a workflow to quantify the thermal heterogeneity at the landscape scale based on RGB and TIR maps acquired from a UAV, applying spatial statistics to the TIR values of objects detected and classified using remote sensing techniques on the RGB orthoimage.

2.4.4.4 Vegetation indices using the TIR band

The use of thermal images and the spatial variation of surface temperature as a proxy for plant transpiration rate and stomatal conductance is an efficient indicator of plant water status, because stomatal closure occurs before any other changes in plant water status (Jones, 1992). Thus, TIR imagery provides useful information to monitor plant water status and/or stress using vegetation indices. High-precision plant water status maps can be retrieved from remotely sensed TIR imagery through these stress indices, which are very useful tools for irrigation monitoring and for studying plant trait responses to their environment, especially in areas where water resources are limited.


The crop water stress index (CWSI) is one of the most commonly used indices in crop water stress studies and irrigation scheduling applications (Idso et al., 1981). CWSI is a normalized index that was developed to overcome the influence that other environmental variables cause on the relationship between crop temperature and water stress. The empirical CWSI is calculated as:

$$\mathrm{CWSI} = \frac{(T_c - T_a) - (T_c - T_a)_{LL}}{(T_c - T_a)_{UL} - (T_c - T_a)_{LL}}$$

where $T_c - T_a$ is the measured difference between canopy and air temperature; $(T_c - T_a)_{LL}$ is the lower limit of $(T_c - T_a)$ for a given vapor pressure deficit (VPD), which is equivalent to a canopy transpiring at the potential rate; and $(T_c - T_a)_{UL}$ is the maximum $(T_c - T_a)$, which corresponds to a non-transpiring canopy.
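Under the assumption that the lower and upper baselines have already been established for the prevailing VPD, the empirical CWSI can be computed per pixel as in the short sketch below (Python); the baseline and temperature values are placeholders, not measured data.

import numpy as np

def cwsi(canopy_temp, air_temp, dt_lower, dt_upper):
    """Empirical crop water stress index (Idso et al., 1981).

    dt_lower: (Tc - Ta) of a canopy transpiring at the potential rate
              for the current vapour pressure deficit.
    dt_upper: (Tc - Ta) of a non-transpiring canopy.
    """
    dt = np.asarray(canopy_temp) - air_temp
    index = (dt - dt_lower) / (dt_upper - dt_lower)
    return np.clip(index, 0.0, 1.0)   # CWSI is usually bounded to [0, 1]

# Placeholder values: canopy temperatures from a TIR orthomosaic (degC)
canopy = np.array([[27.5, 29.1], [31.4, 33.0]])
print(cwsi(canopy, air_temp=28.0, dt_lower=-3.0, dt_upper=5.0))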

Other commonly used vegetation indices based on TIR imagery are the water deficit index (WDI) (Moran et al., 1994), which is suitable for non-full-cover vegetation surfaces, the temperature-vegetation dryness index (TVDI), which assesses land-surface dryness (Sandholt et al., 2002), and the vegetation health index (VHI), which combines thermal and multispectral data to monitor vegetation health, drought, and moisture (Choi et al., 2013).

2.4.5 Challenges and limits

By taking into account all the best practices provided in this chapter to avoid the pitfalls addressed, one should be able to appropriately record accurate thermal data with TIR cameras onboard UAVs in order to address various environmental and other issues. However, some challenges in UAV-borne TIR imagery still remain to be faced.

The radiometric correction of TIR images based on simultaneous ground- and UAV-based thermal recordings with TIR cameras is suited to providing accurate surface temperature measurements by taking into account the atmospheric component effects (e.g., distance, particle emission, wind, …) and the potential bias of TIR sensors due to their temperature dependency (Mesas-Carrascosa et al., 2018). However, even when following this empirical calibration procedure, the resulting accuracies of either calibrated or uncalibrated TIR cameras achieved no better than a few degrees (Yon et al., 2008), a resolution that may not be sufficient for many applications such as ecophysiology or plant phenotyping. Indeed, the ground TIR data that are used for calibration are also affected by the same internal (and to a minor extent external) bias, leading to potential misestimates of absolute surface temperatures. The calibration of TIR sensors against blackbodies is an effective way to increase the accuracy of TIR measurements (Torres-Rua, 2017), although these devices are not always available, and the procedure is time-consuming.


TIR images acquired from UAVs provide instantaneous, spatially distributed thermal information, but their associated low resolution results in difficulties such as mixed pixels, which affect the interpretation of TIR data (Jones & Sirault, 2014), particularly on heterogeneous surfaces such as plant canopies that do not fully cover the soil. Indeed, when several elements of the study area are included in the same TIR pixel (including part of the studied object), the resulting value of the pixel consists of a temperature mixture of these different elements. In order to avoid misinterpretation of object temperatures due to mixed pixels, we advise discarding at least two rows of TIR pixels at the border of the studied object and flying at an appropriate elevation to adapt the TIR image resolution to the size of the studied object (see 2.4.2).
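Discarding the border pixels can be automated with a simple morphological erosion of the object mask, as in the sketch below (Python with SciPy). The two-iteration erosion matches the "two rows of pixels" rule of thumb, and the mask and temperature values are illustrative.

import numpy as np
from scipy.ndimage import binary_erosion

def interior_pixels(object_mask, border_rows=2):
    """Remove `border_rows` pixel rows around the object to avoid mixed pixels."""
    return binary_erosion(object_mask, iterations=border_rows)

# Illustrative 8x8 object mask (True = studied object)
mask = np.zeros((8, 8), dtype=bool)
mask[1:7, 1:7] = True

core = interior_pixels(mask, border_rows=2)
# Mean temperature of the object computed from interior pixels only
tir = np.random.default_rng(1).uniform(25, 35, size=mask.shape)
print(tir[core].mean())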

Usually, the viewing angle when capturing TIR images from a UAV is nadir. But similar to the shooting distance effect (Faye et al., 2016b), the viewing angle is known to impact TIR camera readings (Clark, 1976). Thus, care should be taken when analysing temperature readings on TIR images taken with an oblique viewing angle. Moreover, these effects might lead to inconsistencies in the TIR orthomosaicking process. As explained by Sledz et al. (2018), the blending step, which identifies tie points from several images taken with different viewing angles using the pixels from all parts of the TIR image (including image vignetting effects), is critically hampered by the viewing angle effects, resulting in a lower quality of the TIR orthomosaic reconstruction.

TIR cameras onboard UAVs can provide relatively high-resolution and spatially resolved surface temperature measurements and, therefore, provide a powerful tool for the environmental sciences. However, UAV-TIR measurements provide no information on the temperatures of beneath-surface layers (i.e., under-canopy, under-rock or soil temperatures), which represent a major part of the thermal environment experienced by living organisms. Other thermal approaches, such as proxidetection TIR imaging or point-based thermal recording, can then complement the surface data.

References for further reading


2.5 Multi- and hyperspectral imaging

Sandra Lorenz, Robert Jackisch, René Booysen, Robert Zimmermann and Richard Gloaguen

2.5.1 The spectrum: physical background on the absorption and emission of light
2.5.2 The image: from spectral to hyperspectral
2.5.3 The pre-processing: origin and correction of data distortions
2.5.3.1 Geometric distortions
2.5.3.2 Radiometric disturbances
2.5.4 Post-processing and interpretation
2.5.5 Discussion and outlook on current innovations

Multi- and hyperspectral (MS and HS) imaging are currently deployed at a wide range of spatial dimensions (“scales”), ranging from satellites observing the Earth and other planets down to lab-scale sensing for small-sample spectral analysis. New techniques such as UAV-borne imaging or terrestrial scanning of vertical targets are emerging and allow observing any target at a wide and contiguous range of scales.

Deploying spectral imaging on unmanned aerial platforms or drones has created one of the most promising application fields of spectral imaging in the last decade. Lightweight, low-cost, customizable, and usable by anyone and nearly anywhere, UAVs close the scale gap between airborne and ground-based spectroscopy and offer individual solutions for the respective application. Short turnaround times and a high variability and customizability of platforms and sensors enable targeted surveying of inaccessible or complex areas or objects of interest. Depending on the flight altitude and the deployed sensor, spatial sampling distances in the range of a few centimeters can be reached while still offering a single image footprint of over one thousand square meters. With multi-image or push-broom spectral imaging surveys, a sufficiently large area can be covered within tens of minutes. Current developments in UAV technology aim to increase flight path automatization, object detection, and collision avoidance, as well as system redundancy. Concurrently, the market for small and lightweight spectral sensors is growing fast. Sensors in the visible and near-infrared (VNIR) range of the electromagnetic spectrum are well represented and distributed by a variety of companies, as they are based on common CCD technology and common optics, and require no additional cooling. The development and application of lightweight full short-wave infrared (SWIR) sensors (up to 2,500 nm) is more complex and still in its early stages. Recently, a few companies were able to offer SWIR push-broom sensors with a mass below 7 kg. The rapidly ongoing miniaturization could soon allow lightweight multi- or even hyperspectral sensors in the mid-wave (MWIR) and long-wave (LWIR) infrared. Reflectance spectroscopy, as a passive technique, is currently the most common approach for drone-borne imaging spectroscopy. However, active spectroscopic methods using fluorescence effects are also increasingly researched for drone-borne applications.

Parallel to the technical development, the number of prospective users and application fields for drone-borne spectral imaging is rising fast. One of the main fields of interest encompasses the wide range of vegetation analyses, such as precision farming, forestry, plant species and health monitoring, as well as soil moisture detection (chapters 4.3, 4.4, 4.7). Important, but less common fields of application are hydrology (chapter 4.3), geology (chapter 4.1), geomechanics, and environmental monitoring (chapter 4.8).

Multispectral imaging is currently the most advanced and most widely applied spectral imaging technique for UAV-borne use. The well-progressed development of ready-to-use UAV-borne MS sensor systems allows straightforward processing and the delivery of trustworthy, high-quality data products. Optimized routines comprise the required steps for radiometric and geometric processing and have been implemented in established photogrammetric software (Agisoft Metashape, Pix4D) as easily applicable and well-documented workflows.

In contrast, corrections of drone-borne HS data are rarely applied in recent publications and the data interpretation often does not exploit the potential of the dataset. Whereas many basic applications, such as the calculation of vegetation indices on flat terrain, are still possible with poorly corrected data, more advanced problems, such as spectral end-member analysis or lithological mapping in hilly terrain, crucially rely on the scientific rigor of the corrected dataset. Geometric and radiometric disturbances are often not trivial to handle and differ greatly from the effects known from satellite or airborne data. The influence of the atmospheric spectral component at low flight altitudes is usually small, while the differences in illumination caused by microtopography need to be considered carefully. So far, the novelty and diversity of UAV platforms and HS sensors have hindered the establishment of universal data processing routines as they exist for MS or satellite and airborne HS data. Future development of universal open-source workflows is required to ensure that not only developers but all users of UAV-based HS imagery can obtain well-corrected data.

The following sections will give an insight into the principles of multi- and hyperspectral imaging that are required to understand the physical nature of spectroscopic processes as well as the sensor-specific and external factors that influence the acquisition of spectral data. In later sections, the state of the art of drone-borne multi- and hyperspectral sensors and common and application-specific data correction and processing workflows are given to outline the remaining challenges.

2.5.1 The spectrum: physical background on the absorption and emission of light

Optical spectral analysis in general is the measurement of matter-light interactions as a function of their energy. More specifically, this encompasses any radiation that is emitted, reflected, or transmitted from the investigated target (Clark, 1999). The typical wavelength ranges analysed in spectral imaging comprise the VNIR, SWIR, MWIR, and LWIR, as depicted in Figure 2.5-1.

The concept of quantized molecular energy is key to the understanding of any absorption and emission processes observed in spectral imaging. It states that the possible quantum states of individual atomic species (atoms, ions, or molecules) are well defined at characteristic energy levels. These states are characteristic of the particles' physical nature and the dynamic and energetic processes affecting them. An atomic species possesses different sets of energy levels, associated with electronic, vibrational, rotational, and translational processes as well as electron spins. Besides a low-energy or ground state, each set can feature several high-energy or excited states. An excited state is reached when the species absorbs an amount of energy matching the states' energetic difference. Once excited, the transition back to a lower energy state usually happens spontaneously through the emission of energy with a frequency resembling the energy of the transition.

As the differences in energy level vary depending on the type of the associated process, absorption and emission occur in different spectral ranges. Changes in rotational energy are observed in the microwave down to the UV range, vibrational processes are mainly expressed in the infrared range, and electronic energy transitions are characteristic of the visible and UV range (Figure 2.5-1).


Figure 2.5-1: The electromagnetic spectrum: important properties and relations for spectral imaging. (UV: Ultraviolet, FIR: Far Infrared). (Lorenz, 2019). Unless otherwise stated, all images were prepared by the authors for this chapter.

An optically active center is usually affected by several processes, resulting in characteristic absorption and emission features over the entire electromagnetic spectrum. In visible and infrared spectroscopy, the observed absorption and emission effects mostly originate from atom or molecule vibrations and electronic transitions (Clark, 1999). Infrared-range photon energies are too small to excite electrons; instead, atoms and groups in covalent bonds are excited into a range of vibrational motions such as stretching and bending. Fundamental features at shorter wavelengths (4,000–1,450 cm-1 or 2.5–6.9 µm) are mostly broad and related to stretching vibrations of diatomic regions (group frequency region), while signatures in the so-called fingerprint region (1,450–600 cm-1 or 6.9–16.7 µm) are usually a highly complex mixture of stretching and bending vibration effects. Weaker features occur at multiples of one fundamental absorption frequency and at additions of several fundamental absorption frequencies, referred to as overtones and combinations. The excitation of electronic transitions requires higher excitation energies than thermal vibrations and can therefore be observed mainly in the visible, but also in the UV and SWIR ranges of the electromagnetic spectrum. The processes that electronic transitions are related to are manifold:

Crystal field effects are associated with unfilled or partially filled shells of transition elements (such as Fe, Ni, Cr, and Co) located in a crystal field. The influence of the field causes a splitting of the transition element's electronic states, and thus a shift of the transition energy. The splitting and the resulting absorbed or emitted energies are highly dependent on the crystal structure and are therefore characteristic of the host mineral.

Charge transfer absorptions occur when electrons are transferred between two metal ions (intervalence charge transfer, e.g. Fe2+-Fe3+, Fe2+-Ti4+) or between a cation and oxygen (oxygen-metal charge transfer, e.g. Fe-O, Cr-O). Charge transfer absorptions are usually located in the UV and lower VIS and are much stronger than crystal field effects.

Band gap electronic transitions occur in materials featuring an energetic gap between the conduction and valence bands. Only radiation with energies exceeding the energetic gap is absorbed, causing an absorption edge. At wavelengths above the edge and within the band gap, the material is theoretically transparent, whereas at lower wavelengths all incident radiation is absorbed. For silicates, the absorption edge is situated in the UV and the spectral signal in the VNIR remains unaffected. In sulfide minerals, the absorption edge is located at much higher wavelengths, λ, from 350 nm for sphalerite (ZnS) up to 3,350 nm for galena (PbS).

Color centers are caused by the incidence of ionizing radiation or an imperfect crystal (Hunt, 1977). These imperfections may be lattice defects due to the presence of impurities (replaced ions), vacancies (missing ions), and interstitials (additional ions forced in between the lattice). The resulting modified ions and trapped electrons possess their own electronic states. Related absorptions appear as broad spectral features visible in the VNIR as a variety of distinct colors (e.g., the colors of irradiated apatite, topaz, or zircon).

Similar to vibrational processes, the energy absorbed by electronic processes in every case causes an excited energy state, from which the electron can relax. The respective spontaneous, discrete emission of light unrelated to thermal radiation is referred to as luminescence. Depending on the process triggering the excitation, multiple types of luminescence are distinguished, such as chemi-, electro-, and photoluminescence, which often can be further subdivided. A common approach to measuring meaningful luminescence spectra is the excitation with a strong, monochromatic excitation source such as a laser or LED (Light-Emitting Diode) under the total absence of other light. A pulsed light source offers the possibility of on-/off-measurements to retrieve a luminescence signal under ambient light.

In addition to the described effects, thermal radiation and grey-body emission are common to every object or surface with a temperature above 0 K, resulting in a constant emission of infrared radiation due to the thermal motions of its charged particles. At an assumed thermodynamic equilibrium, the emitted radiation behaves according to Planck's law (Planck, 1914). Idealizing the emitter to a blackbody, which absorbs every incident radiation at all wavelengths and emits solely thermal radiation, the emitted wavelength- and temperature-specific radiation is simplifiable by Planck's function (see Figure 2.5-1). With increasing temperature, the intensity of the radiation emitted by any matter rises, while the wavelength at which the maximum radiation intensity is observed decreases. The radiance spectra of incandescent light sources, such as the sun or lightbulbs, often have their intensity maximum in the VIS, where radiation is visible to the human eye. For matter at temperatures commonly experienced on the Earth's surface, the maximum radiation intensity is situated within the invisible infrared range of the electromagnetic spectrum (Figure 2.5-1). This results in the interference of the matter's thermal radiation with additional polychromatic light and complicates the interpretation of the observed radiance signature. In the SWIR range, thermal radiation has only a minor influence and is therefore mostly neglected. The MWIR range is equally influenced by both sources, making its interpretation extremely complicated and often limiting its usage in spectral imaging. The LWIR range is largely dominated by thermal radiation, making it the common range for thermal analysis in remote sensing (chapter 2.4).

2.5.2 The image: from spectral to hyperspectral

Independent of the observed wavelength range, the investigated material, and the underlying spectroscopic processes, the format and visualization of any spectral dataset remain similar. All spectral datasets acquired by imaging spectroscopy in principle feature three dimensions, with at least one value defining the measured signal intensity along at least two spatial and one spectral axis. Depending on the type of data, this basic model can be reduced or extended to different levels of spectral and spatial complexity (Figure 2.5-2).

Standard digital cameras represent a very basic version of a spectral sensor, providing three broad and partly overlapping spectral channels centered at the “true color” wavelengths of blue, green, and red light: a close representation of the human eye vision. Most commonly, the data of all three channels is acquired at the same time and on the same sensor by covering the sensor with a particular pattern of color filters.

Multispectral sensors tune this concept for spectral analysis, not only by extending the observable spectral range towards the near infrared, but also by discretizing the single channels. As the overall number of channels is still very low (usually between four and ten), a similar sensor concept as for RGB cameras can be used. While such snapshot sensors allow acquiring data without temporal and with negligible spatial offset, the increased number of channels drastically reduces the achievable spatial resolution. For this reason, many popular off-the-shelf MS cameras such as the Parrot Sequoia+ or the MicaSense RedEdge-MX operate as a multi-camera (or camera rig), i.e. all channels are provided by separate cameras in a fixed rig, which are triggered simultaneously during image acquisition. Once the offset between the cameras is determined, all channels can be treated similarly during further processing. The use of a single sensor together with a fast-rotating filter wheel provides another alternative without spatial and with negligible temporal offset; however, the increased number of moving parts also increases the size and fragility of the sensor. Common to all MS sensors are a small size, weight and price, a quick acquisition time, and a limited number of channels compared to HS sensors. The number and center wavelengths of the channels are usually set according to the application. Most multispectral sensors are used for vegetation monitoring and as such provide around four to six spectral channels in the green, red, and near-infrared parts of the electromagnetic spectrum, tailored for the calculation of plant-specific spectral indices (chapters 4.4, 4.7, 4.8).

Figure 2.5-2: Schematic examples of different levels of dimensionality of spectral data, with x, y, z being the spatial, λ the spectral, and t the temporal axes (Lorenz, 2019).

In contrast to the few broad bands provided by multispectral data, a hyperspectral image is defined as a three-dimensional data-cube with a large number of spectrally narrow, quasi-contiguous entries along the spectral axis. This provides the possibility to query a plottable spectral signature for each spatial position on a surface (Figure 2.5-2, plot 3). The accompanying amount of information results in much larger data sizes compared to polychromatic or multispectral imagery. The acquisition of an HS dataset in a reasonable time is thus more complicated. The multi-camera approach is not feasible for HS data, as it would require acquiring each band with a separate sensor. In theory, snapshot sensors enable the contemporaneous acquisition of one dataset at a time, but are still rarely used as this is often achieved by a decrease of either spectral or spatial resolution or signal-to-noise ratio (SNR). Common HS sensors, therefore, reduce the amount of simultaneously acquired data by sequential scanning of, e.g., one spatial pixel at a time (whisk broom or across-track scanning), one spatial pixel line at a time (push broom or line scanning), or one spectral channel at a time (frame-based imaging).

These approaches require either moving parts within the device or a movement of the whole sensor to acquire a complete data-cube. Due to the time offset between the individual recordings, additional movements of the sensor platform lead to image distortions and trigger the need for additional data pre-processing steps. Related to their acquisition principles, whisk and push broom scans are dominantly in need of spatial alignments between the acquired pixels or lines, while frame-based images may feature spatial offsets between spectral channels. Examples of current commercial HS sensors suitable for UAV-borne use are given in Table 2.5-1.

Fast sensors that are less prone to generate spatial distortion effects, such as snapshot or frame-based sensors, can be deployed on smaller, but also less stable UAVs. Whisk and push broom scanners often provide spectrally higher quality data, but in general require more stable platforms with a higher payload to carry additional equipment for geometric calibration, such as a global navigation satellite system (GNSS) receiver and an inertial measurement unit (IMU). Based on current legislation in most countries, UAV systems up to 25 kg MTOW (maximum take-off weight) have relatively easy permitting (chapter 1.4). Redundancy adds weight, but is a critical safety factor; thus, all systems (GPS, IMU, …) should be designed to be fully redundant. Multifrequency GNSS receivers and INS systems that allow for either RTK or PPK computation of the flight trajectory are also critical in that sense (chapter 2.1).

Currently, the majority of civil (lightweight) UAVs used for environmental sciences can be grouped into two categories: fixed-wing systems and multi-copters (chapter 1.3). In general, fixed-wing drones have a lower payload capacity than multicopters. Fixed-wing drones that are capable of carrying heavier payloads usually offer only confined space for the payload and need more space for take-off and landing. Further limitations originate from the relation of flight speed vs. altitude. For fixed-wing UAVs, an airspeed of around 15–25 m/s is necessary to maintain a stable flight. Depending on the wind conditions, this may result in a ground speed of more than 30 m/s; however, due to regulations, the maximum altitude is usually limited to 100–120 m above ground (chapter 1.4). To achieve high-quality data, the deployed sensor needs to operate at very high frame rates, which at the same time limits the maximum integration time and therefore also the SNR in the images. True color RGB and multispectral sensors, which are usually light and less affected by fast movements, are the preferred payload of fixed-wing UAVs. In particular, for higher altitude flights with longer range (BVLOS) and higher coverage requirements, the fixed-wing solution is most attractive. Hyperspectral sensors are typically heavier than multispectral sensors and require stop-and-go or slow-speed acquisition to achieve data with sufficient SNR. Multicopters (or multi-rotor platforms) are the preferred choice here, as they allow such low-speed acquisition. However, these platforms are usually characterized by significant vibrations and high-frequency rolling and pitching to maintain a levelled flight. Mounting the sensor on a high-quality gimbal is therefore an important asset to achieve sufficiently stable acquisition conditions.

Table 2.5-1: Examples of current multi- and hyperspectral sensors for drone-borne use. A more exhaustive list can be found in Adão et al., 2017 or Aasen et al., 2018. FWHM = Full Width at Half Maximum.

Geometric distortions encompass any effects that influence the spatial correctness of an image or dataset. Spatial quality in this case is achieved if the spatial projection of any information delivered by the image/dataset matches its real location within a reference surface/space. The definition of the reference system is artificial, but allows setting different datasets into a spatial context and describing the location of any image feature with unequivocal and universal coordinates (chapter 2.1). The compensation of any geometric distortion in conjunction with the geolocation of the dataset in a reference system is called orthorectification.

The origins of geometric disturbances in spectral image data are manifold and directly related to the imaging principle:

Sensor-specific, internal, or optical distortions occur due to the technical design and mechanical imperfections of the sensor itself. Common examples are one- and two-dimensional barrel (fish-eye) distortions or curvature effects at the slit of line-scanners due to diffraction. By careful determination of the individual device-specific distortion coefficients (radial and tangential) and internal camera parameters (focal length, skew, and center coordinates) within calibration routines, the distortions can be removed from the dataset (a minimal correction sketch is given at the end of this subsection).

The main external distortions originate from the viewing angle of the sensor, may it be stable during the acquisition of one or all datasets in a survey, or variable due to random and systematic movements of the platform. Stable off-nadir viewing angles can usually be corrected by perspective un-distortion of the image. A stable velocity of the sensor or platform can be used to calculate the appropriate aspect ratio of the resulting pixels. However, changing velocities, as they are particularly common in multi-rotor platforms (also refer to chapter 1.3), are much harder to correct and require logging during the acquisition for a satisfactory correction. The required parameters comprise any variability in sensor or platform movement, such as pitch, roll, yaw, skew, and changes in position and altitude (Figure 2.5-3). For whisk broom, push broom, and frame-based sensors this results in distortions between each acquired pixel, line, or spectral band, respectively (Figure 2.5-3). For snapshot sensors or at extreme movements, an additional blurring of the image may occur. The most common correction approach is the logging of the three-dimensional location, time, and axial acceleration (= angular position) using a GPS and an IMU attached to or near the sensor during the entire survey. A post-processing accuracy (RMS error) in the range of sub-decimeter for positioning (x, y, z) and better than 0.1° for roll/pitch/heading is recommended (e.g., Applanix AP-15UAV or similar). Several INS (inertial navigation system) measurement types (e.g., MEMS, fiber optic systems) and (multi-)antenna GNSS setups are established (chapter 1.3). However, these devices differ significantly in price and weight. Sufficient accuracy levels are commonly reached by integrating a multifrequency GNSS receiver and a MEMS inertial measurement unit. After careful boresight alignment, i.e. the correction of angular misalignment between the measurement axes of the single sensors, the recorded information can be used to separately orthorectify each distorted part of the image. For fast movements, such as for very small platforms, the approach does not apply, either because additional devices are not allowed by the limited payload of the platform or due to the limited accuracy and synchronicity of position, orientation, and HSI measurements. The cost factor also plays an important role for small surveys. For these reasons, alternative strategies need to be developed (chapter 2.1). Topography can have a strong negative influence on the spatial correctness of a dataset (chapter 2.2). The amount of distortion is highly dependent on the topographical height differences within the scene as well as the altitude of the sensor. In particular, with strong topographic effects, the number of control points required for an accurate orthorectification is hardly achievable manually. Alternative correction approaches encompass (1) automatic keypoint detection, matching, and respective warping of the dataset to an orthophoto with similar or higher spatial resolution, or (2) projection of the image onto a high-resolution digital elevation model (DEM) using sensor position, angles, and altitude, as well as image-specific parameters such as the field of view (FOV). While approach (1) is independent of the exact knowledge of all acquisition parameters, approach (2) is robust to low information content or quality of the dataset (e.g., off-shore imaging, extremely noisy, or cloud-covered images).

A combination of several external distortions – such as pronounced topography in conjunction with low acquisition altitudes, strong sensor movements, or high platform velocity – can complicate the distortion correction distinctly. For this reason, the use of a gyro-stabilized sensor or gimbal is highly advised for drone-borne data, as it can help to reduce pitch and roll movements of the sensor during the acquisition, which eases the correction of the remaining effects.
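
As an illustration of the internal (lens) distortion correction mentioned above, the following minimal sketch removes radial and tangential distortion from a single image band using OpenCV; the calibration matrix and distortion coefficients are hypothetical values that would normally come from a lab or checkerboard calibration.

```python
import cv2
import numpy as np

# Hypothetical internal calibration results (focal length, principal point, zero skew)
camera_matrix = np.array([[1500.0, 0.0, 640.0],
                          [0.0, 1500.0, 480.0],
                          [0.0, 0.0, 1.0]])
# Radial (k1, k2, k3) and tangential (p1, p2) distortion coefficients
dist_coeffs = np.array([-0.25, 0.08, 0.001, -0.0005, 0.0])

def undistort_band(band: np.ndarray) -> np.ndarray:
    """Remove lens distortion from a single (grayscale) image band."""
    h, w = band.shape[:2]
    # alpha = 0 crops the result to valid pixels only
    new_matrix, _ = cv2.getOptimalNewCameraMatrix(camera_matrix, dist_coeffs, (w, h), 0)
    return cv2.undistort(band, camera_matrix, dist_coeffs, None, new_matrix)
```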


Figure 2.5-3: Schematic illustration of common geometric distortions due to sensor or platform movement, left: push broom scanning, right: frame-based imaging. (A) Characteristic movements of an unstabilized travelling aerial platform. (B) Resulting line-wise distorted image of a push broom HSI. (C) Characteristic movements of a hovering drone-borne gyro-stabilized platform. (D) Resulting band-wise distorted image of a frame-based HSI. The landscape background of both top figures was created with Google Maps satellite imagery (Lorenz, 2019).

2.5.3.2 Radiometric disturbances

Radiometric effects disturb the spectroscopic information within the dataset and comprise global, spatially-local, and/or spectrally-local deviations in the pixel values. Similar to geometric effects, their origin may be internal (sensor-related) or external (environment-related). A dataset corrected for any internal radiometric effects is usually referred to as at-sensor radiance. Correction for any external illumination effects results in TOA (top of atmosphere) reflectance; an additional atmospheric correction finally retrieves surface reflectance.

While irradiance, E, defines how much radiometric flux is received by a surface per unit area and is given in W·m−2 (or W·m−3 at wavelength dependency), the radiance, L, indicates how much radiometric flux is received or released from a surface per unit area and unit solid viewing angle. It is given in W·sr−1·m−2 (or W·sr−1·m−3 at wavelength dependency) and, in contrast to irradiance, is independent of the distance to the illumination source. Reflectance, R, as the ratio between incident and reflected radiation, is unitless and usually given either in percent or as a factor between zero and one.
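
In compact form, and using standard radiometric notation not introduced in the text above (Φ for radiant flux, A for area, Ω for solid angle, θ for the angle to the surface normal), the three quantities can be summarized as:

$$E = \frac{d\Phi_i}{dA}\ \left[\mathrm{W\,m^{-2}}\right], \qquad L = \frac{d^2\Phi}{dA\,\cos\theta\,d\Omega}\ \left[\mathrm{W\,sr^{-1}\,m^{-2}}\right], \qquad R = \frac{\Phi_r}{\Phi_i}\ [-]$$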

Important examples of internal radiometric disturbances comprise dark current, bad pixels, vignetting, smile, and keystone effects (Barreto et al., 2019).

Dark current refers to the signal received by a photodetector in the absence of any incident external light. The measured electrons are thermally generated within the sensor (e.g., at defects in the semiconductor band structure) and lead to a noise pattern that is especially visible in low-signal images. This noise consists of a hardly correctable random or shot noise part and a rather fixed temperature- and pixel-specific pattern, which can be corrected by subtraction from the dataset. As dark current is a thermal effect, sensor cooling is highly advised to achieve stable and low-noise imagery, especially for measurements in the IR range of the spectrum.

Dead, stuck, and hot pixels (often summarized as "bad pixels") are sensor pixels that fail to return a meaningful signal; instead, they provide permanently minimal (dead) or maximal (stuck) intensity or show anomalous values after sensor heating (hot). In the acquired image data, these pixels appear as definite one-dimensional lines along a spatial or spectral axis with zero, infinite, or anomalous values. Even if their information content is irrevocably lost, they can be eliminated by interpolation – for example from the spectrally and spatially closest image values (Kieffer, 1996) – to avoid a further disturbance of the dataset in subsequent processing. A minimal sketch of dark-frame subtraction and bad-pixel interpolation is given after these examples.

Similar to commercial RGB cameras, spectral sensors utilizing a lens may be subject to vignetting, i.e. a radial loss in intensity towards the image edges. A correction requires knowledge of the optical pathway, and can be achieved by data-driven cross-track illumination correction or the application of a pixel- and wavelength-specific gain and offset matrix. The latter is used to correct for device-specific deviations in sensitivity between the pixels of the sensor array in general.

In push-broom imaging systems, optical aberrations and misalignments of the sensor can lead to a concurrent spatially and spectrally curved distortion, known as smile (or frown) and keystone effects. In this context, smile refers to a shift of the center wavelength, keystone to a band-to-band misregistration (Yokoya et al., 2010). Both effects are usually corrected using sensor-specific calibration values.
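
Referring back to the dark current and bad pixel corrections above, the following minimal numpy sketch subtracts a pre-recorded dark frame and fills flagged bad pixels from their spatial neighbours; the array names and the availability of a dark frame and a bad-pixel mask are assumptions, not part of the original text.

```python
import numpy as np

def correct_dark_and_bad_pixels(cube: np.ndarray,
                                dark_frame: np.ndarray,
                                bad_mask: np.ndarray) -> np.ndarray:
    """Subtract a dark frame and fill bad pixels in a hyperspectral cube.

    cube:       raw digital numbers, shape (rows, cols, bands)
    dark_frame: sensor dark signal, same shape as cube (or broadcastable)
    bad_mask:   boolean array, shape (rows, cols), True where a pixel is bad
    """
    corrected = cube.astype(np.float64) - dark_frame
    rows, cols, _ = corrected.shape
    for r, c in zip(*np.nonzero(bad_mask)):
        # Average the valid 8-neighbourhood for every band (simple spatial interpolation).
        r0, r1 = max(r - 1, 0), min(r + 2, rows)
        c0, c1 = max(c - 1, 0), min(c + 2, cols)
        window = corrected[r0:r1, c0:c1, :]
        valid = ~bad_mask[r0:r1, c0:c1]
        corrected[r, c, :] = window[valid].mean(axis=0)
    return corrected
```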

Depending on the acquisition circumstances, numerous external radiometric effects can influence the measured signal (Figure 2.5-4). The radiance of the illumination source defines the maximal achievable radiance (full reflection). For drone-borne measurements, the illuminating irradiance is usually a mixture of direct solar irradiance and diffuse sky irradiance resulting from the scattering of sunlight in the atmosphere. Changes in irradiance intensity or spectral shape during one or between several surveys result in global differences of measured at-sensor radiance, either within one or between several datasets. Depending on the sensor-target distance, different compensation approaches exist.

Figure 2.5-4: Paths of radiance and external radiometric disturbances in a HS field acquisition (based on the concept of Jensen, 2007; first published in Lorenz, 2019).

For low altitude drone-borne data, reference targets with known reflectance spectra and an orientation similar to the observed surface can be used to determine the current downwelling irradiance. The targets should have known, ideally featureless spectra and a constant diffuse reflectance within the measured wavelength range. Well suited are white or grey polyvinyl chloride (PVC) plates in the VNIR, high-purity polytetrafluoroethylene (PTFE or Teflon) in the SWIR, and brushed aluminum or coarse high-purity gold in the LWIR. If the reference targets are not visible within each acquired image or scan line, an on-board irradiance ("sunlight") sensor can log irradiance intensity (and spectra) for each scene for later compensation (Gilliot et al., 2018; Hakala et al., 2018). In comparison to the reference-target-based approach, an irradiance sensor allows the correction of potential irradiance variations occurring during the acquisition process. If no such sensor is available, a data-driven bundle-block adjustment can be used to estimate and correct for overall illumination differences between overlapping images (Honkavaara et al., 2012). At higher altitudes, the pixel footprint eventually becomes too large to allow the usage of reference targets. However, the downwelling irradiance within one acquisition should be rather constant and, on a clear and sunny day, can be estimated according to the current date and time, sensor-target distance, and assumed atmospheric composition.
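
As a sketch of the reference-target approach (a single-panel variant of the empirical line method), the following assumes a grey panel of known reflectance is visible in the scene; the panel reflectance values, the pixel mask, and the neglect of an offset term are assumptions made for illustration only.

```python
import numpy as np

def panel_to_reflectance(cube: np.ndarray,
                         panel_pixels: np.ndarray,
                         panel_reflectance: np.ndarray) -> np.ndarray:
    """Convert at-sensor radiance (or DN) to reflectance via a single reference panel.

    cube:              image data, shape (rows, cols, bands)
    panel_pixels:      boolean mask of the panel area, shape (rows, cols)
    panel_reflectance: known panel reflectance per band, shape (bands,)
    """
    # Mean panel signal per band; assumes the panel is homogeneously illuminated.
    panel_signal = cube[panel_pixels].mean(axis=0)
    # Per-band gain mapping measured signal to reflectance (offset neglected here).
    gain = panel_reflectance / panel_signal
    return cube * gain
```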

The reflected signal of a specific surface depends on a range of parameters, and its behaviour can be described by the Bidirectional Reflectance Distribution Function (BRDF, Nicodemus et al., 1977). It is defined as the ratio f_r between the differential scattered radiance, dL_r, in direction of the observing sensor and the differential incident irradiance, dE_i, with:

$$f_r(\lambda, \theta_i, \phi_i, \theta_r, \phi_r) = \frac{dL_r(\lambda, \theta_r, \phi_r)}{dE_i(\lambda, \theta_i, \phi_i)} = \frac{dL_r(\lambda, \theta_r, \phi_r)}{L_i(\lambda, \theta_i, \phi_i)\,\cos\theta_i\,d\Omega_i} \qquad (1)$$

Here, λ expresses the dependency of the BRDF on the wavelength in spectral measurements. The terms (θ_i, ϕ_i) and (θ_r, ϕ_r) describe the declination and azimuth of irradiance and reflection, respectively. It can be seen that the incident irradiance, dE_i, is represented by the radiance, L_i, which is incident under the solid angle, dΩ_i, onto the surface. Hereby, an incidence angle, θ_i, off the surface normal leads to an illuminated surface area that is larger by 1/cos θ_i than at normal incidence. By that, the radiation intensity is reduced by the factor cos θ_i. As a result, surfaces illuminated at an angle far from the surface normal appear darker than those with near-normal illumination. Materials with a BRDF dependent on both ϕ_i and ϕ_r show an additional variation in radiance when the azimuth of the illumination is changed (the material is rotated). Such surfaces are referred to as anisotropic, in contrast to isotropic materials. Additionally, the BRDF can be divided into two main components, i.e. specular and diffuse reflection, and can be influenced not only by direct but also by concurrent ambient illumination.

An exhaustive experimental determination of the BRDF is seldom reasonable due to its high dimensionality as well as material and texture dependency. For surveys carried out on flat topography and with sufficient overlap between individual images (60–80 %), the BRDF effect can be suppressed by incorporating only the very central image parts into the final mosaic or by using overlapping image regions to estimate the required parameters. In more complex scenarios, empirical and theoretical models can be used to approximate the material-specific effect of the BRDF (Cook & Torrance, 1981; Lambert, 1760; Schlick, 1994). In remote sensing, the assumption of Lambertian behaviour is common, which represents isotropic diffuse reflection (Civco, 1989; Teillet et al., 1982). For image acquisition with large pixels, such as high-altitude drone-borne data, and areas with low topography this approach usually retrieves satisfactory results. Thus, across-track brightness gradients in the imagery of sensors with a wide view angle can be corrected (Cross-Track Illumination Correction (Kennedy et al., 1997)). However, in rugged terrain as well as over anisotropic surfaces such as forests or meadows, the Lambertian assumption can lead to strong overcorrection, especially at off-nadir viewing angles or at illumination at an angle far from the surface normal. A range of empirical non-Lambertian illumination/topographic correction methods has been developed, such as the c-factor (Teillet et al., 1982), Sun-Canopy-Sensor (SCS (Gu & Gillespie, 1998)), or Minnaert (1941) approaches. The determined wavelength-specific empirical coefficients are retrieved by regression of pixel brightness and illumination angle. Despite the distinctly improved results for rugged terrain, these approaches lack performance in areas with high material variability, as in theory each material with a different BRDF would require the calculation of a separate empirical coefficient. Data pre-classification and separate correction would be required to achieve a sufficient regression error.
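
As an illustration of such regression-based corrections, the following minimal sketch applies the c-factor correction attributed above to Teillet et al. (1982) to one band, given per-pixel cosines of the local illumination angle derived from a DEM and the solar geometry; all array and variable names are assumptions for this example.

```python
import numpy as np

def c_factor_correction(band: np.ndarray,
                        cos_i: np.ndarray,
                        sun_zenith_deg: float) -> np.ndarray:
    """Topographic illumination correction with the c-factor method (one band).

    band:  observed radiance/reflectance, shape (rows, cols)
    cos_i: cosine of the local illumination angle (surface normal vs. sun), same shape
    """
    cos_sz = np.cos(np.deg2rad(sun_zenith_deg))
    # Linear regression band = a + b * cos_i; the empirical coefficient is c = a / b.
    b, a = np.polyfit(cos_i.ravel(), band.ravel(), 1)
    c = a / b
    return band * (cos_sz + c) / (cos_i + c)
```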

None of these approaches can sufficiently correct for shadows yet. Usually, the affected pixels are determined using the illumination angle (core shadow) and the surrounding topography (cast shadow) and are masked out afterwards. Compensation for shadows is practically almost impossible. Firstly, the signal intensity from shadowed areas commonly falls within the background noise level of the sensor and fails to contain any valuable information. Secondly, the retrieved signal is a specific mixture of reflection from different sources of diffuse irradiance (sky, trees, and neighbouring topography). In illuminated pixels, the contribution of these sources is generally low enough to barely interfere with the received signal. In shadowed pixels, they are the only light source and their individual contributions are very specific and hardly estimable for each pixel.

Besides interactions at the surface of the target, every radiation path in the system is influenced by the atmosphere. In addition to reflection and scattering at atmospheric particles, which weaken the signal and produce diffuse sky irradiance, all traveling photons are subject to absorption by atmospheric gases and dust. Depending on the crossed thickness and composition of the atmosphere, the intensity and spectral shape of the atmospheric disturbances vary. For low flight altitudes, e.g. below a hundred meters, the influence of the atmosphere on the downwelling and reflected light is usually corrected using several reference ground targets (Empirical Line Calibration – ELC (Smith & Milton, 1999)). For higher altitudes, atmospheric compensation by physical modelling is common. Several algorithms exist to estimate the atmosphere's spectral contribution, usually combined with topographic illumination correction (e.g., ATCOR (Richter & Schläpfer, 2018), FLAASH (Cooley et al., 2002)). Such tools usually utilize lookup tables based on calculated radiative transfer models such as MODTRAN (Berk et al., 2014) or 6SV (Vermote et al., 1997). Several input parameters are required, such as time, date, altitude, and location of the measurement, weather conditions, and a high-resolution digital elevation model. It has been shown that these tools are also applicable for low-altitude UAV acquisitions (Schläpfer et al., 2018).

The signal finally arriving at the sensor is composed not only of the radiance of the target (including all described disturbances), but also the path radiance of light scattered in the atmosphere without reaching the ground, as well as light from surrounding surfaces scattered into the field of observation (adjacency radiance). Every surface with a temperature above 0 K additionally emits thermal radiation, which interferes with the reflected signal. At common temperatures, this affects mainly the LWIR part of the electromagnetic spectrum. Only very hot surfaces (over several hundred degrees Celsius) such as lava flows can influence VNIR and SWIR measurements (chapter 2.4, 4.6).

2.5.4 Post-processing and interpretation

Spectral imaging data can be interpreted by two major approaches, either by directly analysing the physical spectroscopic properties of materials or by using a more mathematical approach of classifying the data according to extractable patterns or data features using machine learning techniques (chapter 3.2).

Spectral analysis in principle relies on the spatial mapping of spectroscopic properties such as specific absorption or emission features. The focus of the analysis can be set on single features or reach up to full material-specific spectral patterns. This also results in different analysis approaches. Single features or spectral characteristics are often mapped according to spectroscopic knowledge, e.g. using simple band ratioing approaches that map the ratio of reflectance values at specific, manually set wavelengths. This approach is mainly used for multispectral data, where a full analysis of a spectral feature is not possible. It can provide simple abundance maps of spectrally active materials, such as plants or iron minerals. For HS data, comparable but more informative approaches exist that allow mapping the accurate width, depth, and position of a specific spectral feature to draw conclusions, for example, on the material composition (van der Meer et al., 2018). Spectral mapping or the analysis of full material-specific spectral patterns is much more complex. Usually, available spectral validation data from field measurements, extracted image spectra, or official spectral libraries (Kokaly et al., 2017) are used as reference spectral signatures. This reference can then be compared to the observed image spectra to determine the spectral similarity and thus the probability of occurrence of the mapped material per pixel. Different spectral mapping approaches exist (Harris, 2006). These approaches perform best if the analysed pixel covers a target of homogeneous composition, which makes the spectral signatures directly comparable. For natural targets or low spatial resolution of the sensor, mixed spectra are a common effect, i.e. the observed spectrum is actually composed of a mixture of contributing spectral "endmembers". Image processing techniques exist that allow an extraction or "unmixing" of these spectral components to retrieve and map information on the target's composition (Bioucas-Dias et al., 2012) as well as the abundance of the different components per pixel. This approach is particularly interesting for geological targets that represent a heterogeneous mixture of minerals with highly variable spectral composition.
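
Two common building blocks of such analyses are spectral similarity measures and linear unmixing. The following minimal sketch computes the spectral angle between each image spectrum and a reference signature and, separately, estimates per-pixel endmember abundances with non-negative least squares; the reference spectra and endmember matrix are assumed to come from a spectral library or field measurements and are not specified in the text.

```python
import numpy as np
from scipy.optimize import nnls

def spectral_angle(cube: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Spectral angle (radians) between each pixel spectrum and a reference spectrum.

    cube: (rows, cols, bands), reference: (bands,). Smaller angles mean higher similarity.
    """
    pixels = cube.reshape(-1, cube.shape[-1])
    cos = pixels @ reference / (np.linalg.norm(pixels, axis=1) * np.linalg.norm(reference))
    return np.arccos(np.clip(cos, -1.0, 1.0)).reshape(cube.shape[:2])

def unmix(cube: np.ndarray, endmembers: np.ndarray) -> np.ndarray:
    """Per-pixel non-negative abundances for a set of endmember spectra.

    endmembers: (n_endmembers, bands); returns (rows, cols, n_endmembers).
    """
    pixels = cube.reshape(-1, cube.shape[-1])
    abundances = np.array([nnls(endmembers.T, p)[0] for p in pixels])
    return abundances.reshape(cube.shape[0], cube.shape[1], -1)
```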

The techniques of classification or domain mapping allow a different approach of data analysis based on the categorization of data pixels according to data-specific criteria. Machine learning and artificial intelligence play a major role in the development of algorithms for classification and the related fields of segmentation and feature extraction (Ghamisi et al., 2017). The resulting classification maps allow a clear discrimination between different domains that can be composed of mixed materials and do not necessarily need to be characterized by one specific spectral signature (e.g. lithological units, plant communities).

2.5.5 Discussion and outlook on current innovations

Multispectral cameras represent the currently most used spectral imaging systems for UAV-based acquisition. In particular for applications in vegetation (crop, forestry) monitoring, a range of ready-to-use systems exists on both a scientific and a commercial basis, including not only the camera itself but also optimized platforms and dedicated processing routines for data correction, processing, and interpretation. Their ruggedness and endurance make MS cameras a good choice for simple mapping tasks or when harsh imaging conditions (rough terrain, long flight times) are expected. Their low number of spectrally fixed channels constrains possible applications mostly to the initial design aim. However, this also allows designing cheap and reliable systems, which can deliver the exact mapping products that are required by a potential customer, thus making MS drone systems ideal for industrial use. Interchangeable filters and/or an extension of the observable spectral range to the SWIR and LWIR could increase the flexibility of MS cameras and broaden their applicational spectrum.

Hyperspectral cameras, with their capability to capture hundreds of image bands, offer the detection and mapping of plants, soil, and rock, as well as respective mineral types. The maturity of UAV-based HSI is constantly advancing and with it the level of quality to characterize unique properties of ecosystems, including topographical and physical aspects, surface composition, and vegetation at once (Arroyo-Mora et al., 2019). UAV-based HS data require specific steps of pre-processing and correction which differ from aircraft and satellite platforms. The novelty of the approach, but also the diversity and customizability of UAV platforms and deployable sensors, has impeded the establishment of correction workflows available for any user. As a result, most published UAV-borne datasets are only partly or not corrected for platform-specific radiometric and geometric effects. In particular, for targets with highly variable morphology this can cause distinct distortions within the dataset (Jakob et al., 2017). While it is still possible to retrieve meaningful information from poorly corrected data using simple two-band ratios, detailed spectral analysis of usually narrow absorption features is not possible. Geological applications in particular build on reliable spectral information, as the spectral differences between mineralogical domains are usually subtle. The establishment of versatile and comprehensive processing workflows will be key to ensure quality standards within UAV-based HSI. The Mineral Exploration Python Hyperspectral Toolbox (MEPHySTo) (Jakob et al., 2017) was one of the first to combine the essential tools for a full processing workflow of UAV-based HSI, and could provide a basis for implementing today's broad range of advanced processing algorithms (Aasen et al., 2018).

Multi-sensor UAV approaches are used increasingly across many scientific disciplines, with the environmental and agricultural sciences as one of the very early contributors, e.g. in combining spectral and elevation information for segmentation and plant species detection. The generation of quantitative multi-sensor spatial mapping products in precision farming with low-cost UAVs became more feasible in the early 2000s (Berni et al., 2009). Recent studies show the potential of using a UAV equipped with exchangeable sensors to investigate plant communities and ecosystem properties, such as peatlands (Beyer et al., 2019). The integration of multispectral, high-resolution RGB, and thermal data in combination with photogrammetry and machine-learning-driven image classifiers makes it possible to derive high-precision plant parameters. Since the introduction of user-friendly UAV handling, software, and processing routines, numerous applications have been developed and tested. Future applications are likely to expand from earth sciences and engineering towards human social interaction (Xiang et al., 2019).

The next step in the evolution of UAV-based HSI could be real-time or quasi-real-time on-board data processing and evaluation. Such systems would provide a direct link and feedback of the acquired data to the end-user on the ground. The fusion of state-of-the-art hardware to capture and process the resulting heavy data streams is challenging but possible (Horstrand et al., 2019). It is important to note that most HS sensor systems are currently carried by multi-rotor UAVs, which are limited in flight time and therefore spatial coverage. For UAVs to become a complete substitute for manned airplane imaging systems, currently only fixed-wing or VTOL (Vertical Take-Off and Landing) platforms offer the necessary endurance and flexibility to compete.

A parallel development in sensor technology regarding miniaturization and innovative sensor concepts will allow using a much wider range of spectral imaging sensors for drone-borne use in the future. As a major advantage, it will allow increasing the value of drone-borne spectral surveys for new application fields. It additionally fuels the diversity of available sensors on the market and promotes the development of increasingly low-cost and user-friendly products. In the VNIR, a large assortment of competing products already allows making an application-based selection according to the required spectral range, sensitivity, resolution, or budget. The number of available sensors in the SWIR is far more limited, as sensor cooling is obligatory to reach a sufficient SNR, which complicates the device design, limits possible miniaturization, and increases the price distinctly. Still, innovative concepts are already under development and promise much lighter and cheaper sensors for the future (Goldstein et al., 2018). A similar development is ongoing for LWIR-range HS sensors. Appropriate lightweight drone-borne versions are not yet commercially available but have been announced recently (Boubanga Tombet et al., 2019) and will increase the application portfolio of drone-borne surveys in the future. Mineral mapping campaigns could benefit from such sensors in particular, as they would allow increasing the number of detectable rock-forming minerals (chapter 4.1).

Drone-borne spectral sensors of any kind are, in most cases, used for reflectance measurements only, relying on the irradiance of the sun. However, approaches to measuring the luminescence signal from a drone platform have started (Burud, 2019; Duan et al., 2019). For the moment, only point measurements are performed in a drone-based setup, but the concept could be extended to a mapping approach soon. The luminescence signal could be used alone or complementary to reflectance measurements to characterize and monitor vegetation, hydrocarbons, man-made structures, or valuable raw materials (Lorenz et al., 2019) (chapter 2.7).

References for further reading


2.6 UAV laser scanning

Gottfried Mandlburger

2.6.1 LiDAR principles
2.6.1.1 Laser ranging
2.6.1.2 Scanning
2.6.1.3 Laser beam model
2.6.1.4 Signal detection and waveform processing
2.6.1.5 Geometric sensor model
2.6.1.6 Radiometric sensor model
2.6.1.7 Flight planning
2.6.1.8 Quality control and sensor orientation
2.6.2 UAV-LiDAR sensor concepts
2.6.2.1 Sensor overview
2.6.2.2 Topo-bathymetric sensors
2.6.2.3 Sensor integration examples

Laser scanning, in general, is a method for obtaining 3D information of the environment. UAV (Unmanned Aerial Vehicle) laser scanning, in particular, delivers dense 3D point clouds of the Earth's surface and objects thereon like buildings, infrastructure, and vegetation. In contrast to conventional airborne laser scanning (ALS), where the sensor is typically mounted on manned aircraft, UAV laser scanning (ULS) utilizes Unmanned Aerial Systems (UAS) as measurement platforms, which allow lower flying altitudes and velocities compared to manned platforms, resulting in higher point densities and, thus, a more detailed description of the captured surfaces and features.

Whereas the benefit of ALS is large-area acquisition of topographic data, with the Digital Terrain Model (DTM) being the prime product, ULS can be thought of as close-range ALS enabling applications which require high spatial resolution. However, both ALS and ULS are similar in the fundamental aspects of operation.

ULS is a dynamic, kinematic data acquisition method. The laser beams are continuously sweeping in lateral direction and, together with the forward motion of the platform, a swath of the terrain below the UAV is captured. The distances between the sensor in the air and targets on the ground are determined by measuring the time difference between the outgoing laser pulse and the portion of the signal scattered back from the illuminated targets into the receiver's Field of View (FoV). This is commonly referred to as the Time-of-Flight (ToF) measurement principle. Like laser scanning in general, ULS is therefore a sequentially measuring, active acquisition technique.

To obtain 3D coordinates of an object in a georeferenced coordinate system (e.g. WGS84), the position and attitude of the platform and the scan angle need to be measured continuously in addition to the ranges. Thus, ULS is a dynamic, multi-sensor system, where each laser ray has its own absolute orientation. In contrast to aerial photogrammetry, where the image orientation can be established by bundle block adjustment based on ground control points, ULS mainly relies on direct georeferencing. The use of a navigation device consisting of a GNSS (Global Navigation Satellite System) receiver and an IMU (Inertial Measurement Unit) is indispensable.

ULS is a polar measurement system, i.e., a single measurement is sufficient to obtain the 3D coordinates of an object. This is of special advantage in case of dynamic objects like tree canopies which are permanently moving due to wind. For image-based techniques, this is a relevant problem because in the UAV context, the image ground sampling distances (GSD) are typically in the cm range, and small object movements lead to displacements of multiple pixels.

Figure 2.6-1: 3D UAV-LiDAR point cloud of a forest plot; (a) colored by echo number: 1st echoes (blue) accumulate in the canopy whereas 2nd and 3rd echoes dominate on the ground; (b) colored by reflectance: small twigs and branches feature lower reflectance (blue) compared to laser returns from understorey (green), and from stems and bare ground (orange).

Unless otherwise stated, all images were prepared by the author for this chapter.


The ideal laser ray is infinitely small, but the actual laser beams can rather be thought of as light cones with a narrow opening angle. In ULS, the typical diameter of the illuminated spot on the ground (footprint) is in the cm- to dm-range depending on the flying altitude and the sensor's beam divergence. Due to the finite footprint, multiple objects along the laser line-of-sight can potentially be illuminated by a single pulse. In such a situation, ToF sensors can return multiple points for a single laser pulse. This so-called multi-target capability together with high measurement rates leads to unprecedented 3D point densities for the acquisition of semi-transparent objects like forest vegetation (cf. Figure 2.6-1).

Next to signal runtime, ULS sensors typically deliver additional attributes for each detected echo. Especially if the entire incoming radiation is sampled and stored with high frequency (full waveform recording), object properties like reflectance can be derived via radiometric calibration of the signal (cf. Figure 2.6-1b). The received signal strength strongly depends on the employed laser wavelengths, which range from the visible green to the near-infrared part of the spectrum. Green laser radiation (λ = 532 nm) is capable of penetrating water and is therefore used in laser bathymetry for capturing the bottom of clear and shallow water bodies. Infrared wavelengths (λ = 905/1,064/1,550 nm), in turn, exhibit better reflectance characteristics for vegetation, soil, sealed surfaces, etc. Thus, infrared lasers are the first choice for topographic mapping and forestry applications. This is equally relevant for both ALS and ULS.

Another similarity between ALS and ULS is data acquisition with partially overlapping flight strips. The overlap area provides the basis for (i) checking the strip fitting accuracy and (ii) geometric calibration of the sensor system via strip adjustment. In contrast to area-wide data capturing, ULS is particularly well suited for corridor mapping (river courses, forest transects, fault lines, etc.). While manual piloting of the UAV is restricted to visual line-of-sight (VLOS) operation, regular scan grid patterns are usually realized via waypoints, which potentially enables beyond visual line-of-sight (BVLOS) operation given the respective permission.

The remainder of the chapter is structured as follows: Chapter 2.6.1 details the fundamentals of laser ranging and scanning, the sensor geometric and radiometric model, and the principles of flight planning, quality control, and sensor orientation via strip adjustment. Chapter 2.6.2 gives an overview of available UAV-LiDAR topographic and bathymetric sensors and their integration on different UAV platforms and discusses the pros and cons of the individual sensor systems together with their targeted field of application. The chapter concludes with a list of related references.


2.6.1 LiDAR principles

In this section, the principles of airborne laser scanning in general and of UAV-based laser scanning in particular are briefly summarized (Pfeifer et al., 2015; Shan & Toth, 2018; Vosselman & Maas, 2010).

2.6.1.1 Laser ranging

The core component of each laser scanning system is the ranging unit. Knowing the speed of light c, the pulse emission time t_0, and the arrival time of the return pulse t_1, the sensor-to-target distance R can be calculated as:

$$R = \frac{c\,(t_1 - t_0)}{2} \qquad (1)$$

If (i) the laser beam hits an extended planar target under a normal incidence angle and (ii) the speed of light (group velocity) is accurately determined, the ranging accuracy is directly related to the timing error. To achieve a ranging accuracy in the cm range, sub-nanosecond time measurement accuracy is required.
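
To make the timing requirement concrete, the following small sketch (added here as an illustration, not from the book) evaluates eq. 1 and the timing accuracy needed for a given ranging accuracy:

```python
# Speed of light (approximate group velocity in air), m/s
C = 299_792_458.0

def tof_range(t0_s: float, t1_s: float) -> float:
    """Sensor-to-target distance from pulse emission and return times (eq. 1)."""
    return C * (t1_s - t0_s) / 2.0

def required_timing_accuracy(range_accuracy_m: float) -> float:
    """Timing accuracy (seconds) needed for a desired ranging accuracy."""
    return 2.0 * range_accuracy_m / C

# Example: a 1 cm ranging accuracy demands roughly 0.067 ns timing accuracy.
print(required_timing_accuracy(0.01) * 1e9, "ns")
```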

Ranging based on the ToF principle is employed for manned ALS as well as for most ULS systems. However, the phase-shift method constitutes an alternative ranging approach. In this case, a modulation signal is imprinted onto the continuously emitted laser carrier and the offset between the phase of the emitted and the returned signal is measured. The main advantage of the ToF principle is its inherent multi-target capability. This is particularly useful for environmental studies, especially when scanning semi-transparent objects like forests, where the laser light is able to penetrate the vegetation through small openings in the foliage. The phase-shift technique, in contrast, only delivers a single return per pulse.

2.6.1.2 Scanning

As in traditional ALS, sampling of the Earth's surface with UAV-based laser scanning is accomplished based on flight strips. Areal coverage with 3D points requires (i) the forward motion of the UAV platform and (ii) a beam deflection unit systematically steering the laser rays below or around the sensor. Figure 2.6-2 shows typical beam deflection mechanisms used in ULS.



Figure 2.6-2: Mechanical beam deflection strategies used in UAV-based laser scanning.

Assuming both horizontal terrain and horizontal forward motion of the platform with constant velocity, a rotating multi-faced polygonal wheel produces parallel scan lines on the ground approximately perpendicular to the flight trajectory. The constant rotation of a mirror polygon yields an approximately constant point distance along the scan line within a typical FoV of ±30° around the nadir direction. By adjusting rotation speed (scan rate), flying velocity, and pulse repetition rate (PRR), a homogeneous point pattern on the ground can be achieved, both along and across track (Figure 2.6-2a).
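
As a back-of-the-envelope illustration of this relation (added here, with hypothetical values), the along-track line spacing and the mean point spacing along a scan line can be estimated from scan rate, flying velocity, PRR, FoV, and flying height:

```python
import math

def polygon_scanner_spacing(v_ms: float, scan_rate_hz: float,
                            prr_hz: float, fov_deg: float, agl_m: float):
    """Approximate point spacing for a rotating polygon scanner over flat terrain.

    Returns (line_spacing_m, point_spacing_along_line_m).
    """
    # Distance the platform moves between two successive scan lines.
    line_spacing = v_ms / scan_rate_hz
    # Swath width on the ground for the given field of view and flying height.
    swath = 2.0 * agl_m * math.tan(math.radians(fov_deg / 2.0))
    # Number of pulses emitted per scan line.
    points_per_line = prr_hz / scan_rate_hz
    return line_spacing, swath / points_per_line

# Example: 8 m/s, 100 lines/s, 200 kHz PRR, 60 deg FoV, 50 m AGL
print(polygon_scanner_spacing(8.0, 100.0, 200_000.0, 60.0, 50.0))
```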

Panoramic scanning in vertical scan planes is achieved using a scan wedge, where the mirror plane is tilted by 45° with respect to a horizontal rotational axis. As the laser scanner is typically mounted below the UAV, the full circle of laser beams is restricted in practice to approximately 230°. This still allows scanning even above the horizon, which is beneficial in the context of environmental mapping, e.g., to acquire narrow canyons or riverside vegetation. Concerning the homogeneity of the point pattern, the same as for polygonal wheels applies for the nadir area (±30°). Due to panoramic scanning, the swath is much wider and is only limited by the maximum measurement range of the sensor. The point spacing decreases with increasing distance from the strip center and with larger ranges the size of the laser footprints increases (Figure 2.6-2b).

In contrast to scanning in vertical planes as described above, oblique scanning with a constant laser beam off-nadir angle results in a spiral-shaped scan pattern on the ground. This is, e.g., implemented by employing a rotating scan wedge with a tilted rotational axis (Palmer scanner, Figure 2.6-2c). Palmer scanners are especially used in laser bathymetry with off-nadir angles between 15–20°, as this is the optimum trade-off for receiving reflections from the water surface as well as for penetration of the laser signal into the water column (Guenther et al., 2000). For topographic applications, oblique scanning enables looking under bridges and potentially provides more returns from facades, depending on building height, road width, and laser beam tilt. It also provides a forward and backward look in the same scan line (more precisely: scan circle or ellipse), thus hitting objects from different viewpoints. However, this double-look feature diminishes from the center towards the border of the strip. A downside of this scanning mechanism is the inhomogeneous point distribution with a much higher density at the border compared to the center of the strip, which needs to be appropriately considered during data processing.

Oscillating mirrors constitute an alternative to constantly rotating mirrors or polygons (Figure 2.6-2d). The mirror constantly swings between two positions. The extreme mirror positions mark the border of the strip. Due to the necessary deceleration at the end of the swing, the point density is higher at the border of the strip compared to the center, as is the case for Palmer scanners.

A completely different scanning approach is pursued by a technique referred to as solid-state hybrid LiDAR (Frost et al., 2016) or rotating multi-beam LiDAR, respectively. In this context, solid-state means that no rotating or oscillating device is used to deflect the laser beam in different directions, but the entire laser unit spins around an axis. As this technology often operates a fan of laser range finders (8/16/32/64/128 channels) in parallel, the term profile-array scanner is used in the following. In ULS, scanner integrations with horizontal rotation axes are preferred, enabling panoramic scanning similar to Figure 2.6-2b, but with multiple laser channels and, hence, multiple scan lines per revolution. This potentially increases the capturing rate by a factor of n, with n = number of laser channels. Figure 2.6-3 illustrates the general principle.


Figure 2.6-3: Scanning principle of profile-array laser scanners.

In all strategies shown so far, an individual detector receives the backscattered signal from a single narrow laser shot. In contrast to that, so-called flash LiDAR or focal plane LiDAR sensors use a broad laser pulse and the backscattered signal illuminates an array of receivers. These systems are also termed ToF cameras or range cameras, as the result of a single laser pulse is a range image (Hansard et al., 2012). Thus, no scanning in the above sense using rotating elements is required to obtain areal coverage, which enables an extremely compact and lightweight design (~100 g). Due to the limited measurement range on natural targets (< 50 m), flash LiDAR is not further considered here, but with ongoing development it is likely to become an option in UAV-based LiDAR mapping for environmental applications in the future.

2.6.1.3 Laser beam model

While the ideal laser shot is infinitely short and narrow, in practice typical UAV LiDAR sensors exhibit a laser pulse duration in the range of 1–6 ns corresponding to 30–180 cm in metric units and feature a laser beam divergence of around 0.5–3 mrad resulting in a laser footprint on the ground of 2.5–15 cm for a flying altitude of 50 m above ground level (AGL).

The energy distribution in longitudinal and radial direction (along and across the laser beam direction) is commonly described as a Gaussian function (Jutzi & Stilla, 2005; Słota, 2015). Figure 2.6-4 shows a conceptual drawing of the energy distribution within a laser beam and the corresponding mathematical formulation is provided in eq. 2.

Figure 2.6-4: Laser beam model (adapted from Słota, 2015).

$$I(t, r) = I_0\,e^{-\left(\frac{t^2}{2\sigma_{tang}^2} + \frac{r^2}{2\sigma_{rad}^2}\right)} \qquad (2)$$

I_0 denotes the peak energy level, which is reached at temporal position t = 0 and radial position r = 0, i.e. along the laser axis in the middle between the rising and falling edge of the laser pulse. The intensity decreases exponentially from this center point both along and across the laser line of sight. The drop rate depends on the standard deviations of the Gaussian curves (longitudinal: σ_tang, radial: σ_rad). In signal processing, pulse duration and size are often described by the so-called "full width at half maximum" (FWHM), i.e. the range over which the signal has dropped to half of its maximum. The following relations between FWHM and standard deviation apply for the longitudinal and radial direction:

$$w = 2\sqrt{2\ln 2}\;\sigma_{tang} \qquad (3)$$

$$s = 2\sqrt{2\ln 2}\;\sigma_{rad} = \gamma R \qquad (4)$$

The pulse duration w (eq. 3) directly influences the range discrimination distance, i.e. the capability to separate two consecutive objects illuminated by the same laser beam along the beam path (e.g., two branches of a tree, shrub and ground below, etc.). As a rule of thumb, the minimum time, or distance, respectively, to separate two individual laser echoes is dt = w/2. For a typical pulse duration of 3 ns, the range discrimination distance in metric units is approximately 45 cm.
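
The following small sketch (an illustration added here, not from the book) turns these rules of thumb into numbers for a given pulse duration and beam divergence:

```python
C = 299_792_458.0  # speed of light, m/s

def range_discrimination_m(pulse_duration_ns: float) -> float:
    """Approximate minimum separable target distance along the beam (~ c*w/2)."""
    return C * pulse_duration_ns * 1e-9 / 2.0

def footprint_diameter_m(divergence_mrad: float, range_m: float) -> float:
    """Laser footprint diameter for a given beam divergence and measurement range."""
    return divergence_mrad * 1e-3 * range_m

# A 3 ns pulse separates targets ~0.45 m apart; 1 mrad at 50 m gives a ~5 cm footprint.
print(range_discrimination_m(3.0), footprint_diameter_m(1.0, 50.0))
```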

From eq. 4 it can be seen that the size of the illuminated area s (i.e., the laser footprint diameter) depends on both the measurement range R and the beam divergence γ. The size of the laser footprint inherently limits the spatial resolution of any LiDAR system. As ALS and ULS sensors exhibit comparable beam divergence measures, the spatial resolution of ULS is higher by an order of magnitude due to shorter measurement ranges.

2.6.1.4 Signal detection and waveform processing

In conventional ToF laser ranging, the return signal of a highly collimated laser pulse is received by a single detector. For the conversion of the optical power into digital radiometric information, a two-stage procedure is employed (Ullrich & Pfennigbauer, 2016). First, an Avalanche Photo Diode (APD) converts the received laser radiation into an analog signal, and subsequently an Analog-Digital Converter (ADC) generates the final measurement in digital form. APDs used for UAV-based laser scanning operate in linear mode, i.e. in the dynamic range of the APD where the optical power and the analog output are linearly related. Such APDs deliver measures of the received signal strength and provide object reflectance and/or material properties of the illuminated objects via radiometric calibration (Briese et al., 2012; Wagner, 2010).

The actual range detection is either implemented by hardware components of the laser scanner (discrete echo systems) or by high-frequency discretization of the entire backscattered echo waveform. In the latter case, the captured waveforms are either processed online by the firmware of the sensor (Pfennigbauer et al., 2014) or stored for detailed analysis in postprocessing (Mallet & Bretar, 2009; Shan & Toth, 2018). To date, some existing ULS sensors feature full waveform acquisition with entailed advantages w.r.t. ranging precision, target separability, and object characterization (amplitude, echo width, reflectance, etc.). A detailed discussion of full waveform laser scanning is beyond the scope of this book. More information is found in subject literature (Jutzi & Stilla, 2005; Mallet & Bretar, 2009; Wagner et al., 2006).

2.6.1.5 Geometric sensor model

Figure 2.6-5: Conceptual drawing of the ALS/ULS sensor model (based on Glira et al., 2015b).

UAV laser scanning is a kinematic measurement process based on a tightly synchronized multi-sensor system consisting of a Global Navigation Satellite System (GNSS) receiver, an Inertial Navigation System (INS), and the laser scanner itself. INS sensors are also termed IMU (Inertial Measurement Unit). The computation of georeferenced 3D points is called direct georeferencing and is illustrated in Figure 2.6-5.

In a preprocessing step, Kalman filtering (Grewal et al., 2013) is employed to merge GNSS and IMU observations resulting in a so-called Smoothed Best Estimate of Trajectory (SBET). A Kalman filter integrates the individual positional and inertial measurements over time in a linear quadratic estimation framework considering statistical noise and other sources of inaccuracies. It delivers the absolute 3D positions (X, Y, Z) of the measurement platform in a geocentric, Cartesian (Earth-Centered-Earth-Fixed, ECEF) coordinate frame as well as the attitude of the measurement platform w.r.t. the local horizon (navigation angles: roll, pitch, yaw). GNSS typically provides positions with a rate of 1–2 Hz corresponding to a point distance of 4–8 m for a typical UAV flight velocity of 16 knots (~8 m/s). The INS measurement rate, in turn, is much higher (100–500 Hz) and both 3D positions and attitudes are estimated for each timestamp (t) of the higher IMU frequency within the Kalman filter, resulting in a typical point spacing of consecutive flight trajectory points of 1.6–8 cm.

The trajectory data are subsequently combined with the time-stamped laser scanner measurements. In general, the raw range and scan angle measurements are not directly provided by the sensor manufacturers, as small corrections are applied to the raw data compensating systematic instrument effects which are calibrated in the manufacturer's lab (irregularities of the scan mirrors, amplitude dependency of the range measurement, etc.). This internal calibration leads to 3D coordinates of the detected objects (i.e. laser echoes) in the sensor coordinate system, which constitute the basis for the calculation of 3D object coordinates in an ECEF coordinate system according to eq. 5:

$$\mathbf{x}^e(t) = \mathbf{g}^e(t) + \mathbf{R}_n^e(t)\,\mathbf{R}_i^n(t)\left(\mathbf{a}^i + \mathbf{R}_s^i\,\mathbf{x}^s(t)\right) \qquad (5)$$

The transformation chain in eq. 5 transforms between the following coordinate systems (CS), each denoted by a specific index and highlighted by a specific color in Figure 2.6-5.

• s/blue: scanner CS

• i/red: INS CS, also referred to as body CS or platform CS

• n/no color: navigation CS (local horizon: x=north, y=east, z=nadir)

• e/magenta: ECEF (earth-centered earth-fixed) CS

Reading eq. 5 from right to left, x^s = (x_s, y_s, z_s) is a 3D vector denoting the coordinates of a laser point in the local scanner CS, which is rotated by the boresight angles into the INS system (R_s^i) and shifted by the lever arm (a^i). The lever arm is the offset vector between the phase center of the GNSS antenna and the origin of the scanner system, and the boresight angles denote the small angular differences (Δroll, Δpitch, Δyaw) between the reference plane of the scanner and the INS (cf. green elements in Figure 2.6-5). While the lever arm can be measured on the ground with a total station, the boresight angles are determined within strip adjustment based on data from a calibration flight (Hebel & Stilla, 2012; Skaloud & Lichti, 2006). R_i^n transforms the resulting vector from the INS CS to the navigation system based on the IMU measurements (roll/pitch/yaw), and R_n^e rotates to the Cartesian ECEF system. The latter rotation depends on the geographical position (latitude/longitude) of the INS origin. The 3D coordinates of the laser point x^e(t) are finally obtained by adding the ECEF coordinates of the GNSS antenna (g^e).
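
A minimal numerical sketch of this transformation chain (ignoring the final rotation into ECEF, i.e. computing coordinates in the local navigation frame only) could look as follows; all numbers are hypothetical:

```python
import numpy as np

def rot_rpy(roll: float, pitch: float, yaw: float) -> np.ndarray:
    """Rotation matrix from roll/pitch/yaw (radians), z-y-x convention."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

# Hypothetical calibration and navigation values
R_s_i = rot_rpy(*np.deg2rad([0.05, -0.03, 0.10]))   # boresight misalignment
a_i   = np.array([0.02, 0.00, -0.15])                # lever arm [m]
R_i_n = rot_rpy(*np.deg2rad([2.0, -1.5, 95.0]))      # platform attitude (roll/pitch/yaw)
g_n   = np.array([10.0, 20.0, 60.0])                 # platform position in the local frame [m]

x_s = np.array([0.0, 5.0, -55.0])                    # laser echo in the scanner CS [m]
x_n = g_n + R_i_n @ (a_i + R_s_i @ x_s)              # eq. 5 without the ECEF rotation
print(x_n)
```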

The total positional and vertical uncertainty (TPU/TVU) of ULS-derived 3D points depends on the accuracy of both the laser scanner and the trajectory as well as on the synchronisation of all sensor components (GNSS, IMU, scanner). Compared to ALS based on manned aircraft, the accuracy demand for the angular components (scan angle, platform attitude) is lower for ULS due to the shorter measurement ranges. For this reason, the employed INS sensors are typically less accurate (roll/pitch: ~0.015°, yaw: 0.035°) than those used for ALS. GNSS errors, however, directly translate to respective errors in the ULS point clouds; thus, equally accurate GNSS receivers are required for ALS and ULS.

2.6.1.6 Radiometric sensor model

Information about the radiometric properties of illuminated objects is of high importance for environmental applications. The laser-radar equation describes the fundamental relationship between the emitted and the received optical power (Pfeifer et al., 2015; Wagner et al., 2006):

$P_R = \frac{P_T\, D^2}{4\pi\, R^4\, \gamma^2}\, \sigma\, \eta_{SYS}\, \eta_{ATM} + P_{BK}$  (6)

The received power P_R depends on the transmitted power P_T, the measurement range R, the laser beam divergence γ, the size of the receiver aperture D, the radar cross-section σ, as well as factors related to system losses η_SYS and atmospheric attenuation η_ATM. P_BK, finally, denotes solar background radiation that deteriorates the signal-to-noise ratio.

The laser-radar cross-section σ incorporates all target properties and can be separated into the illuminated target area A, the object's reflectance ρ, and the backscattering solid angle Ω:

$\sigma = \frac{4\pi}{\Omega}\, \rho\, A$  (7)

Ω denotes the opening angle of a cone into which the laser signal is reflected. Specular reflection is characterized by a narrow cone (i.e. small values of Ω). Most of the natural targets (soil, grass, trees, etc.) as well as sealed surfaces (asphalt, concrete) are diffuse scatterers. For ideal diffusely reflecting targets (Ω = 180°), Lambert's cosine law is applicable.

The cross-section further depends on the illuminated area A, which is a function of the measurement range R, the beam opening angle γ, and the incidence angle α between the laser beam and the normal direction of the illuminated surface. For extended targets larger than the laser footprint, the area calculates to (Roncat et al., 2016; Roncat et al., 2012):

$A = \frac{A_L}{\cos\alpha} = \frac{\pi\, \gamma^2 R^2}{4\, \cos\alpha}$  (8)

A_L is the projection of the effectively illuminated target area onto a plane orthogonal to the laser beam direction, which only depends on the measurement range R and the laser beam opening angle γ. Inserting eqs. 8 and 7 into eq. 6 reveals a decrease of received power with the squared sensor-to-target distance (R²). Linear targets (e.g. power lines) crossing the laser footprint, in turn, exhibit an R³ relationship, and the signal loss corresponds to R⁴ for point features (e.g. leaves).
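The following snippet combines eqs. 6–8 for an extended diffuse target. It is a sketch under the assumptions stated above (far field, Ω = π sr for an ideal Lambertian scatterer, the convention used by Wagner et al., 2006); the parameter names are invented for illustration.

```python
import numpy as np

def received_power(P_t, R, gamma, D, rho, alpha,
                   eta_sys=0.9, eta_atm=0.95, P_bk=0.0, omega=np.pi):
    """Received optical power for an extended diffuse target (eqs. 6-8).
    R: range [m], gamma: beam divergence [rad], D: aperture [m],
    rho: reflectance, alpha: incidence angle [rad]."""
    A = np.pi * gamma**2 * R**2 / (4.0 * np.cos(alpha))        # illuminated area (eq. 8)
    sigma = 4.0 * np.pi / omega * rho * A                      # cross-section (eq. 7)
    return P_t * D**2 / (4.0 * np.pi * R**4 * gamma**2) * sigma * eta_sys * eta_atm + P_bk

# Doubling the range quarters the received power for extended targets (1/R^2 behaviour):
p50 = received_power(P_t=1.0, R=50.0, gamma=0.001, D=0.05, rho=0.4, alpha=0.0)
p100 = received_power(P_t=1.0, R=100.0, gamma=0.001, D=0.05, rho=0.4, alpha=0.0)
print(round(p50 / p100, 2))   # -> 4.0
```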

LiDAR sensors do not directly measure the received optical power P_R, but full waveform laser scanning in particular provides the signal amplitude and the width of the return echo, which together are proxies for P_R. Simple correction strategies account for the dominating range effect in the received signal strength measurements (Höfle & Pfeifer, 2007), while rigorous radiometric calibration uses external radiometric reference measurements to obtain object properties like the backscattering cross-section, the backscattering coefficient, or object reflectance (Briese et al., 2012; Kaasalainen et al., 2011; Kashani et al., 2015; Wagner, 2010).
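A minimal data-driven range correction in the spirit of Höfle & Pfeifer (2007) normalizes the recorded amplitudes to a common reference range; an exponent of 2 corresponds to extended targets, whereas linear and point-like targets would require exponents of 3 and 4, respectively. The function name and the default reference range below are illustrative.

```python
def range_normalize_amplitude(amplitude, r, r_ref=50.0, exponent=2.0):
    """Normalize a recorded amplitude to the reference range r_ref [m]."""
    return amplitude * (r / r_ref) ** exponent
```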

The laser-radar equation only applies in the far field, starting at a range of about 50 m. While UAV flying altitudes are seldom lower than 50 m above ground level due to safety considerations, objects standing out from the ground (e.g. buildings, trees, power line towers, etc.) may well result in measurement ranges smaller than 50 m. In this case, sensor manufacturers often provide look-up tables describing the relation between signal strength and short measurement range.

2.6.1.7 Flight planning

ULS data capturing is generally carried out based on individual flight strips. For areal data acquisition, a setup with longitudinal strips for coverage and occasional cross strips for block stabilization (cf. Figure 2.6-6a) constitutes best practice. Adjacent flight strips typically exhibit an overlap of 20–50 %. Corridor mapping often requires a more flexible flight plan with the strips aligned to the object of study (cf. Figure 2.6-6b). Also in this case, sufficient overlap of consecutive strips is crucial for enabling proper quality control and stabilizing the block geometry.


Figure 2.6-6: Flight strip setup for areal survey (a) and corridor mapping (b). The areal setup consists of six longitudinal strips and three cross strips. In the corridor setup, the flight strips follow the river course and strip overlaps are provided at the junction points for block stabilization.

The most relevant parameters for planning a flight are (i) the swath width of the individual strips and (ii) the intended laser pulse density:

The swath width SW (eq. 9) relates to the flying altitude h and the scanner's FOV. This applies to scanners with a finite FOV; for panoramic 360° scanners, SW is only restricted by the maximum range. The mean pulse density PD (eq. 10) is directly proportional to the effective measurement rate MR and inversely proportional to the swath width and the flying velocity v. For Palmer scanners and scanners with an oscillating mirror, MR corresponds to the PRR, while in most other cases MR < PRR; the latter is typically the case for scanners with rotating polygons or 360° scanners. For the former, eq. 10 denotes the mean pulse density as an average of very high density at the strip boundary and more representative lower pulse density in the middle of the strip.
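Because the relations behind eqs. 9 and 10 are easiest to grasp in executable form, the following sketch states one plausible reconstruction based on the description above (flat terrain, nadir-looking scanner with a finite FOV); the exact expressions in the original equations may differ slightly, so treat this as an assumption-laden illustration.

```python
import numpy as np

def swath_width(h, fov_deg):
    """Swath width of a scanner with finite FOV at flying altitude h (cf. eq. 9)."""
    return 2.0 * h * np.tan(np.radians(fov_deg) / 2.0)

def mean_pulse_density(mr, sw, v):
    """Mean pulse density on the ground (cf. eq. 10): effective measurement
    rate MR [pulses/s] divided by swath width [m] and flying velocity [m/s]."""
    return mr / (sw * v)

sw = swath_width(h=70.0, fov_deg=75.0)            # ~107 m
pd = mean_pulse_density(mr=200_000, sw=sw, v=5.0)  # ~370 pulses/m^2
```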


2.6.1.8 Quality control and sensor orientation

Within the common area of two flight strips, the ground surface and objects thereon are measured independently from different viewing points. Deviations in the overlap area are an indicator of the sensor calibration quality. Especially smooth and inclined surfaces (slopes, embankments, roofs, etc.) are well suited to detect potential sensor calibration problems. Deviations can either be measured as strip height differences based on strip-wise, gridded Digital Elevation Models (Ressl et al., 2008) or based on the 3D point clouds by calculating the distances of points in one strip from the planes constructed from the neighbouring points of the overlapping strip (point-to-plane distances). If (i) the residual errors are larger than the nominal accuracy of the employed sensors or (ii) systematic errors occur, re-calibration of the sensor system and orientation of the flight strips becomes necessary.
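A simple way to quantify strip deviations from the 3D point clouds is sketched below: for every point of one strip, a plane is fitted through its nearest neighbours in the overlapping strip, and the signed distance to that plane is reported. Real implementations additionally restrict the evaluation to smooth, well-sampled surfaces and reject unstable neighbourhoods; the function and parameter names are illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

def point_to_plane_distances(strip_a, strip_b, k=10):
    """Signed distances of strip_a points (Nx3) to local planes fitted through
    their k nearest neighbours in the overlapping strip_b (Mx3)."""
    tree = cKDTree(strip_b)
    dist = np.full(len(strip_a), np.nan)
    for i, p in enumerate(strip_a):
        _, idx = tree.query(p, k=k)
        nb = strip_b[idx]
        centroid = nb.mean(axis=0)
        # plane normal = direction of smallest variance of the neighbourhood
        _, _, vt = np.linalg.svd(nb - centroid)
        normal = vt[-1]
        dist[i] = np.dot(p - centroid, normal)
    return dist
```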

Depending on data availability, either approximative methods (Ressl et al., 2011) or rigorous approaches (Glira et al., 2019; Glira et al., 2015a; Pfeifer et al., 2015; Skaloud & Lichti, 2006) can be employed. Approximative methods typically start with the geo-referenced 3D point cloud and try to minimize the (height) deviations in the strip overlap area. Rigorous approaches, in turn, are based on the geometric sensor model (cf. chapter 2.6.1.5) and utilize the raw measurements (i.e., flight trajectory and the coordinates of the laser echoes in the sensor's CS) to estimate the sensor calibration parameters. The most important parameters are (Glira et al., 2016):

• mounting calibration (lever arm and boresight angles)

• scanner calibration parameters (range and scan angle offset and scale)

• trajectory correction parameters (constant offsets, drifts, time-dependent correction terms of higher order; see the sketch after this list)

• datum shift parameters
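The following toy example illustrates the third group of parameters: a constant height offset and a linear drift are estimated by least squares from height differences observed in a strip overlap as a function of trajectory time. Actual strip adjustment estimates such corrections in three dimensions and jointly with the mounting and scanner calibration parameters; the function name and the simple correction model are illustrative assumptions.

```python
import numpy as np

def fit_offset_and_drift(t, dh):
    """Least-squares estimate of a constant height offset and a linear drift
    from strip height differences dh observed at trajectory times t."""
    A = np.column_stack([np.ones_like(t), t - t.mean()])
    (offset, drift), *_ = np.linalg.lstsq(A, dh, rcond=None)
    return offset, drift
```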

Sensor calibration and strip adjustment of ULS does not generally differ from the strategies applied for manned ALS platforms. UAVs, however, are not as stable as manned aircraft and hence prone to sudden fluctuations in position and attitude, especially when flown in windy conditions. Keeping in mind that the highest-class INS sensors are seldom used for ULS, local deviations of the 3D point clouds of overlapping flight strips are likely. To compensate for these short-term effects, a cubic spline-based trajectory correction is proposed by Glira et al. (2016) for UAV flights with sufficient strip overlap and control patches. An optimum sensor calibration and orientation strategy would incorporate the raw GNSS and INS measurements, but this is still subject to scientific research (Cucci et al., 2017). Glira et al. (2019) extended the concept of pure laser scanning strip adjustment to hybrid sensor orientation including camera sensors. Including correspondences between laser strips and image tie points in a comprehensive integrated adjustment framework has proven to improve the trajectory estimation (Glira, 2018).

2.6.2 UAV-LiDAR sensor concepts

2.6.2.1 Sensor overview

Table 2.6-1 provides an overview of existing compact laser scanners suited or even designed for integration on UAVs. The specifications are taken from company brochures. In case of different operation modes, the reported values always denote the mode with the highest measurement rate. The maximum range depends on the object’s reflectivity, where at least ρ≥60 % is assumed. The listed precision (prec) and accuracy (acc) numbers relate to the ranging component only. For better readability, the beam divergence measures are also expressed as footprint diameters on the ground assuming a flying height of 50 m AGL.

Table 2.6-1: UAV-LiDAR sensor specifications.

Figure 2.6-7: Selected UAV-LiDAR sensors (1)–(8). Images from Riegl, Velodyne Lidar and Teledyne Geospatial; used with permission – all rights reserved.

In general, different categories of LiDAR sensors are available. Extremely lightweight sensors (< 1 kg) enable longer flight endurance but are typically less accurate (3 cm) and exhibit a larger footprint diameter in the dm-range. Such sensors can be integrated on small UAS (sUAS) platforms with a maximum take-off mass (MTOM) < 10 kg. Sensors delivering survey-grade precision in the cm range typically weigh around 4 kg and, thus, require larger UAVs with a MTOM of around 25 kg.

The sensors listed in Table 2.6-1 use the scan mechanisms shown in Figure 2.6-2 and Figure 2.6-3. All sensors except (5) and (6) are conventional linear-mode LiDAR systems with a single laser channel and mechanical beam deflection with a rotating polygonal wheel (3), rotating wedge (1, 2, 8), oscillating mirror (7), or conical scanning (4). Sensors (5) and (6) are profile-array scanners with 32 or 128 jointly rotating laser channels. Most of the cited sensors use panoramic scanning (FoV=360°) with (near) horizontal rotation axes, allowing to capture vertical structures to both sides of the scanner in narrow valleys, street canyons, and river corridors. It is a clear advantage of agile UAVs to operate in such demanding scenarios. The downside of this scanning mechanism is that, in most cases, the surfaces and objects of interest are located beneath the UAV. Hence, concentrating the emitted laser pulses on a smaller FoV would increase the effective measurement rate, as is the case for scanners (3), (4) and (7).

The measurement rates ranging from around 200 kHz (2, 4, 5) to more than 1 MHz (3, 6) result in point densities on the ground in the order of 50–500 points/m² depending on flying altitude, flight velocity, and FoV. ULS is therefore well suited for deriving Digital Elevation Models with a grid spacing of 5–10 cm (Escobar Villanueva et al., 2019; Mandlburger et al., 2015). It is noted that the spatial resolution generally depends on both the point spacing and the footprint size and is always limited by the larger of the two. Thus, when choosing the right scanner for a certain application, both aspects need to be taken into account.

2.6.2.2 Topo-bathymetric sensors

Most sensors listed in Table 2.6-1 are topographic scanners based on infrared wavelengths. Sensor (4) is a topo-bathymetric scanner employing a laser operating in the visible green domain of the spectrum (532 nm). At this wavelength, laser light is able to penetrate the water column and measure the bottom of water bodies. While airborne laser bathymetry based on manned aircraft is well suited for mapping clear and shallow coastal areas and larger inland water bodies, the spatial resolution is moderate, as relatively broad laser beams are employed to ensure eye-safe operation. The main advantages of UAV-borne LiDAR bathymetry are (i) the potentially higher planimetric resolution and (ii) the agility of the UAV platforms. The latter makes UAV-based bathymetry an upcoming technique for mapping smaller water bodies like ponds and medium-sized rivers featuring a meandering course.

The depth performance of topo-bathymetric sensors is often defined in multiples of the Secchi depth (SD). Secchi depth is an empirical measure for water turbidity and denotes the distance at which the black and white quadrants of a 30 cm checkerboard disk lowered into the water can no longer be separated. Sensor (4) constitutes a survey-grade topo-bathymetric sensor featuring a maximum depth penetration of 2 SD. However, the sensor requires a powerful UAV platform with a MTOM of around 35 kg. Complementary to sensor (4), more lightweight instruments (5 kg) with a depth penetration of around 1 SD are available too. Such sensors are suitable for capturing small and very shallow clear-water rivers. A comprehensive review of existing topo-bathymetric sensors can be found in Mandlburger et al. (2020).


Figure 2.6-8 shows examples of LiDAR sensor integrations on various UAV platforms. The choice of the appropriate type of UAV depends on the payload capacity and the targeted flight endurance. In general, multicopter, helicopter, and fixed-wing UAVs are potentially suited for UAV-LiDAR integrations, but multicopters (quad-, hexa-, and octocopters) are most often utilized.

Figure 2.6-8: UAV-LiDAR sensor integration examples: (a) octocopter UAV (MTOM: 25 kg) with laser scanner (1) + IMU, cameras, batteries, and GNSS + radio data link antennas; (b) octocopter UAV (MTOM: 35 kg) with topo-bathymetric laser scanner (4) and camera or IR laser range finder; (c) hexacopter UAV (MTOM: 15.5 kg) with profile-array laser scanners (5) + IMU, cameras, and GNSS antennas (source: 4D-IT, www.4d-it.com); (d) fixed-wing UAV (MTOM: 15.5 kg) with profile-array laser scanner (5) (source: Quantum Systems, https://www.quantum-systems.com; YellowScan, https://www.yellowscan-lidar.com/). Images from 4D-IT and Quantum Systems; used with permission – all rights reserved.

Figure 2.6-8a depicts a full-featured hybrid sensor system consisting of a panoramic laser scanner (1), two oblique cameras, and a navigation system (GNSS+IMU) integrated on an octocopter UAV with a MTOM of 25 kg. The maximum flight time of this survey-grade sensor system is around 30 min, and the targeted applications include agriculture, forestry, archaeology, corridor mapping, monitoring of landslides and open-cast mines, and urban mapping.

Figure 2.6-8b shows a bigger version of the same UAV type (octocopter, MTOM: 35 kg) carrying a topo-bathymetric sensor (4). The inertial navigation sensors and an optional camera or IR laser range finder are combined and tightly coupled in a compact housing. The targeted applications for UAV-based LiDAR bathymetry include flood modelling, habitat mapping, monitoring of morphodynamics, roughness estimation for hydrodynamic-numerical modelling, etc.


The system shown in Figure 2.6-8c comprises two ultra-lightweight profile-array laser scanners (5), two oblique-looking RGB cameras, and a navigation system (GNSS+IMU) mounted on a hexacopter UAV. The system features a maximum flight time of 40 min and is used for terrain modelling, vegetation mapping, geological mapping, documentation of pit-mining activities, landslide monitoring, and for high-quality 3D documentation of building facilities, industrial sites, or archaeology.

The final example depicted in Figure 2.6-8d shows the same scanner as above (5) integrated on a fixed wing UAV featuring only limited payload capacity (700 g) but long flight endurance. In general, fixed wing UAVs outperform multicopters in terms of flight endurance, which is especially useful when operated BVLOS. The depicted system features a flight time of 90 min and offers telemetry capabilities within a radius of 7 km. Together with the generally faster cruising speeds compared to multicopter platforms, fixed wing UAVs are best suited for large-area mapping with very high spatial resolution. This is especially beneficial for capturing vast uninhabited forest areas, for which BVLOS flight permissions are easier to obtain compared to populated areas.
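To put the endurance figures into perspective, a rough coverage estimate can be computed from cruising speed, effective swath width, strip overlap, and flight time. The numbers below are purely illustrative assumptions and not taken from the depicted system's specifications.

```python
def coverage_ha(v, swath, t_min, overlap=0.3):
    """Approximate area covered in one flight [ha]: speed [m/s] x effective
    swath [m] x flight time [min], reduced by the strip overlap fraction."""
    return v * swath * (1.0 - overlap) * t_min * 60.0 / 10_000.0

area = coverage_ha(v=17.0, swath=120.0, t_min=90.0)   # roughly 770 ha per flight
```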

References for further reading


2.7 Other UAV sensors

Elisa Casella and Alessio Rovere

2.7.1 Atmospheric variables and aerosol
2.7.2 Natural gas emissions and pollutants
2.7.3 Magnetometry

Regardless of the application, UAVs are largely employed in environmental sciences for a simple reason: they allow researchers or professionals interested in a given process or physical element to gather observations while providing, at the same time, some logistical advantage. The advantage given by UAVs might be measurable in terms of cost (e.g., compared with airborne data acquisition), resolution, rapidity, feasibility, and accessibility of the site that one intends to study.

UAVs are most often used in environmental studies to gather low-cost and high-resolution visible or multispectral aerial imagery of one or more elements of interest within the environment. Typical examples are the reconstruction of orthophotos and Digital Elevation Models from overlapping RGB photos processed with photogrammetric methods (chapter 2.2), or the collection of multispectral imagery (chapter 2.5) to assess the health of crops or the status of ecosystems (Salamí et al., 2014). These are examples of how UAVs can be used to collect ground data, which are treated at length in this volume.

In this chapter, an overview of other uses of UAVs in environmental science is presented. Due to the rapid increase in the use of UAVs in different environmental fields and the broad range of existing applications, it is not possible to cover all other environmental uses of UAVs; instead, we aim at mentioning some of the established ones. Note that we only give a broad overview of these "other" environmental applications, with the necessary references where more detailed information can be found.

A well-established use of UAVs is certainly the study of the air column. Atmospheric scientists have pioneered the use of UAVs to study atmospheric phenomena (Gottwald & Tedders, 1985; Konrad et al., 1970). This resulted in new insights into atmospheric processes that were hitherto difficult to obtain with traditional techniques (e.g. Ramanathan et al., 2007). In the literature, most examples of UAV use relate to operations involving a single drone (Martin et al., 2011; Rautenberg et al., 2018; and others), but there are examples where UAVs are flown in a swarm to collect, for example, multi-layer data (Han et al., 2013; Ramanathan et al., 2007).

The brief overview presented here is based upon the extensive work of Villa et al. (2016), entitled "An Overview of Small Unmanned Aerial Vehicles for Air Quality Measurements: Present Applications and Future Prospectives". These authors reviewed 60 papers on the subject of air quality monitoring and report a detailed description of each of them, including both the UAV platforms and the associated sensors. The review of Villa et al. (2016) is subdivided into three broader sub-topics: "Study of Atmospheric Composition, Pollution and Climate Change", "Measurement of Surface, Interior and Atmospheric Phenomena", and "Measurements for Prevention, Patrolling and Intervention". These are herein summarized into two broader topics: "Atmospheric variables and aerosol" and "Natural gas emissions and pollutants". In addition to those two topics, we briefly explore the use of UAVs in the magnetometry field.

2.7.1 Atmospheric variables and aerosol

UAVs can mount sensors to measure atmospheric variables such as wind intensity and direction (Reuder et al., 2012; and others), even within extreme weather events such as typhoons (Lin et al., 2008). UAVs can also be employed in the measurement of temperature, humidity, pollen, and other variables at different altitudes (Brosy et al., 2017; Renzaglia et al., 2016; Aylor et al., 2006; and others) and to assess the concentration of aerosols and greenhouse gases within the atmospheric boundary layer (ABL) or above it (Villa et al., 2016). As an example, Watai et al. (2005) used a UAV to measure temporal and spatial variations of atmospheric CO2 in and above the ABL; they measured temporal and vertical variations of CO2 from about 650 m to 2,000 m a.g.l. (above ground level). Mayer et al. (2012) used the data collected by a UAV to evaluate the ABL parameterization schemes of the Advanced Weather Research and Forecasting model (AR-WRF). The UAV provided vertical profiles of temperature, relative humidity, and wind from the ground to about 3,500 m a.g.l. This not only proved the capability of the UAV to capture the relevant physical processes of the diurnal evolution of the ABL, but also the crucial value of UAV data for the detailed validation and development of the parameterization schemes used in numerical forecasting models. To give the reader the necessary references for different applications, Table 2.7-1 provides the list of variables measured using different sensors mounted on UAVs. The table presents the variables measured and the dedicated instrument/sensor used on a UAV in the specific study.


Table 2.7-1: Atmospheric variables and aerosol measured by different instruments mounted on UAVs.

2.7.2 Natural gas emissions and pollutants

The concept of mounting sensors on UAVs to assess or monitor in time and space the chemical composition of volcanic plumes has been developed for nearly 15 years (Caltabiano et al., 2005). In a recent review, Jordan et al. (2019, their section 6) give an overview of the studies that have employed different sensors to measure volcanic gases such as sulfur dioxide (SO2) and carbon dioxide (CO2). Other than natural gas emissions, UAVs have been employed to monitor pollutants or other elements that pose a risk to human health. Within this context, specific sensors can be used to measure industrial or radioactive pollutants. Han et al. (2013) showed that fixed-wing UAVs flying in formation can be used to detect nuclear radiation using a lightweight (< 600 g) radiation detection sensor. In radiation surveillance, the advantage of using UAVs is evident, since dangerous missions can be carried out safely from remote locations while the drone reaches contaminated areas and measures the level of radiation. Pollanen et al. (2009) used an air sampler and a gamma-ray spectrometer mounted on a mini-UAV to detect alpha-particle emitting radionuclides. Other sensors can be used to measure the concentration of atmospheric aerosol particles. Brady et al. (2016) used an optical particle counter and a CO2 sensor mounted on a quadrotor UAV to measure vertical and horizontal concentration gradients of CO2 and Particulate Matter (PM, both small-size, with a diameter between 0.5 and 1 μm, and larger size, > 1 μm) at high spatial resolution (1 m). It is worth noting that PM is a significant health hazard (Anderson et al., 2011) and also plays a central role in Earth's radiation budget (IPCC, 2013). Harrison et al. (2015) successfully measured airborne PM concentrations using an aerosol spectrometer and an intake probe mounted on a UAV. Another application in air quality monitoring is reported in the work of Gonzalez et al. (2011), who developed a prototype spore trap onboard a UAV which successfully captured and geolocated spores of pathogens in the air.


Table 2.7-2 gives an overview of the literature on natural gas and pollutant variables measured by specific sensors mounted on UAV platforms.

Table 2.7-2: Natural gas and pollutant variables measured by different instruments mounted on UAVs.

2.7.3 Magnetometry

Magnetometry is used in numerous geophysical applications as a form of site investigation for mineral exploration, infrastructure tracking, unexploded ordnance detection, and other applications involving magnetic field anomalies. UAVs mounting a magnetometer can achieve a higher rate of coverage than terrestrial magnetic surveys at a higher resolution than manned airborne surveys (Walter et al., 2019). The use of UAVs in this field is very challenging due to the difficulty of separating the magnetometer from the UAV platform in order to minimize the UAV's magnetic field influence on the observations. Particularly apt platforms for magnetometry are multicopters, which allow a magnetic sensor to be suspended away from the aircraft to avoid these effects (Walter et al., 2019). This significantly reduces the magnetic field contributions from the UAV platform (Malehmir et al., 2017; Parvar et al., 2018). Walter et al. (2019) investigated the performance of optically pumped vapour magnetometers suspended under a UAV. They concluded that, once the magnetometer sensor axis has been optimally oriented with respect to the Earth's magnetic field, attitude variations of the sensor in pitch and roll are not significant contributors to magnetic data loss; yaw axis variations, in contrast, contribute to magnetic data loss if unrestricted. They demonstrated that fixing the magnetometer yaw axis allows data to be collected within the industry standard. Among the applied examples of UAV magnetometry, we report that of Parvar et al. (2018), who used a magnetometer mounted on a UAV to detect chromite (a mineral used as a source of chromium for stainless steel production) by mapping the serpentinite rocks surrounding it.

References for further reading


3. Data Analysis


3.1 Data formats

Pierre Karrasch and Matthias Müller

3.1.1 The choice of data formats – why does it matter?
3.1.2 Point cloud data formats
3.1.2.1 ASCII data formats
3.1.2.2 Binary data formats
3.1.3 Image data formats
3.1.4 Decision checklist

3.1.1 The choice of data formats – why does it matter?

UAVs themselves do not actually collect any data; what matters is the data generated by the different sensors carried by the UAVs. Chapters 2.4–2.7 already provide a broad overview of different sensor techniques carried by different systems such as copters or fixed wings. The aim of this chapter is therefore to give a fundamental overview of the data structures and formats generated by different sensor technologies. The next subchapters start with an introduction of various point cloud data formats, followed by a discussion of image data formats.

To facilitate the use and exchange of collected data, commonly used data formats follow, or should follow, some basic principles. The use and exchange of data is particularly successful if the data formats are open in the broadest sense. It is important that specific license terms do not bind working with the data to specific software products or impose additional fees on their users. These formats should also be backed by a large community of developers and users to ensure long-term availability and software support. Data formats should allow the data to be stored in a compact and efficient way, but should also ensure that lossless compression algorithms can be applied. This is especially crucial if the data are to be used for computational analysis in later processing steps.


A common example is the analysis of radiometric information (e.g. from camera images), where lossy compression may introduce artifacts that have a negative effect on feature extraction or the computation of band indices. Closely related are considerations of numerical accuracy and value ranges. Low-precision floating-point data (16 and 32 bit) can easily produce computational inaccuracies in the percent range. If possible, sufficiently large floating-point types (e.g. 64 or 128 bit) or precise integer or decimal values are preferred. Finally, a suitable data format should permit fast reading under various access patterns. Data tiling or partitioning divides the whole data set into subsets to support fast seek operations. The integration of pre-generated image pyramids provides generalized representations at different spatial resolutions. Both mechanisms enable fast access to subsets of the data without having to load the whole data set.

For practical applications, the data format is often determined by the (often proprietary) sensor technology and the software used for post-processing and data analysis. Both usually restrict the choice of data formats and may even lead to various problems in data analyses if the overlap is very small and only leaves poor choices. If the technology is less restrictive, the choice of the data format can be determined by the task to achieve optimal results and a seamless processing workflow.

The choice of the data format deserves some consideration, incorporating the immediate task as well as storage and computing aspects and suitability for future applications which may not be fully known at the time of data recording. A poor choice may lead to sub-optimal results, uneconomic use of resources or a restriction of downstream applications of the data. The following example outlines such a case: storing UAV images in the JPEG format, which uses lossy compression and usually encodes data in eight bits per RGB channel, is very economical but immediately and heavily degrades the radiometric accuracy of the stored images. This downgrade removes or modifies finer structures in the image, alters some pixel values to obtain better compression, and introduces artificial patterns that result from the lossy encoding. Any subsequent image classification task will not only produce sub-optimal results – it must also be carefully analysed to identify areas where the classification algorithm was deceived by artificial patterns. Especially for a temporal comparison, this can cause considerable constraints, which in the worst case lead to the necessity to repeat the UAV flights.

To save time and effort in data acquisition, UAV data are very often re-used for other tasks later on. However, this is only possible if such further use has already been anticipated. Hence, the selection of a suitable data format must be part of every flight planning. The following chapters provide decision support for picking the right data format. They cover data formats for point clouds as well as image data formats.


3.1.2 Point cloud data formats

At first glance, the primary structure to store is that of a point. Essentially, a point cloud is a set of points in vector space. This set of points is usually unorganized and lacks any topological information. Such point clouds can be captured directly by laser scanners installed on the UAV, or they can be the product of processing 2D images with Structure from Motion algorithms (SfM, chapter 4.3).

The data formats mentioned in this chapter are certainly not complete and represent only a fraction of the total number of data formats available for point cloud data. Nevertheless, the essential formats are presented here.

The file formats for point clouds can be divided into two large groups – ASCII formats and binary formats – with individual advantages and disadvantages.

3.1.2.1 ASCII data formats

Data formats based on ASCII characters store the point information as lines of text. In the simplest form, each line stores the x, y, z values of the spatial coordinate of one point. More complex formats may contain additional information such as intensities or even colour values for each point. The main advantage of ASCII data formats is that they are plain text and easy to understand by reading; in many cases, a simple text editor is sufficient. In view of the short-lived nature of data formats and the associated risk that data once collected will no longer be readable in the future, ASCII data formats are of particular importance when storing data over long periods of time. But there are also disadvantages associated with ASCII data formats: the use of text characters reduces data density and leads to a rapid increase in the required data volume. In addition, it is not possible to read spatial subsets of the data record without scanning the whole file; data in ASCII format must be read line by line. This can considerably limit the speed of data processing, which is especially true for large scenes with high resolution. In the following, two common ASCII data formats are introduced briefly.

XYZ: In principle, there is no clear specification standard for this file format. The structure of such a file therefore depends very much on the preferences of the creator (e.g. for the column separator: spaces, tabs, commas). Typically, there are columns for the X, Y, and Z coordinates. In theory, the number of columns is unlimited, and additional columns can contain further information about the points, such as colour values. A disadvantage of the missing specification is that errors can occur during data exchange because of missing information on measurement units or coordinate systems if these are not passed as supplementary information.
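A minimal sketch of how such an XYZ file can be read and written with NumPy is shown below; the file names are hypothetical and only the whitespace-separated column structure described above is assumed.

```python
import numpy as np

# Read a simple XYZ point cloud: one point per line, whitespace-separated columns.
# Any additional columns (e.g. intensity or RGB values) are read alongside X, Y, Z.
points = np.loadtxt("survey_points.xyz")         # hypothetical file name

xyz = points[:, :3]                               # X, Y, Z coordinates
print(f"{xyz.shape[0]} points")
print("Bounding box:", xyz.min(axis=0), xyz.max(axis=0))

# Write the cloud back out as ASCII (comma-separated, fixed precision)
np.savetxt("survey_points.csv", points, delimiter=",", fmt="%.3f")
```

Note that the coordinate reference system and the measurement units still have to be documented separately, because the format itself cannot store them.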


OBJ: The OBJ data format is not only suitable for storing point information but can also store more complex geometric objects. This ASCII data format uses the definition of elements, which include points (p), lines (l), curves (curv), 2D curves (curv2) and surfaces (surf). The file starts with an ordered set of points, where each point has a unique index (first, second, and so on). This index is used in subsequent parts of the file when objects such as faces (key: f) are created from these point indices. Many common software products support the OBJ data format. Depending on the application, the point information can also contain colour information, which is appended to each point’s coordinate as additional numbers (typically values between 0 and 1).

3.1.2.2 Binary data formats

The use of binary formats avoids some of the disadvantages of ASCII data formats. Binary data encodings are more storage-efficient than ASCII data and also require less bandwidth for transfer. Many binary data formats support spatial indexing, making it possible to read, visualize and analyse subsets of the data very quickly. Furthermore, they provide additional structures for metadata, sometimes down to each contained coordinate. The obvious disadvantage of binary data formats, however, is the lack of immediate readability, as was the case with ASCII data, making them less suitable for data archives. The LAS format is probably the most important binary point data format.

LAS: The LAS data format is a binary point cloud data format published by the American Society for Photogrammetry and Remote Sensing. Due to the binary storage, a data set in LAS format needs significantly less space than in ASCII formats. The point information can be stored in eleven different types of point data records, which are distinguished mainly by the different available data fields. Depending on the type of data record, the format also offers the possibility to store additional information about the point data, including intensity, colour, GPS time and classification. Within a LAS file, all point data records must have the same format.
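As a brief, hedged illustration of working with such binary data, the following sketch reads a LAS file with the third-party Python library laspy (2.x API); the file name is hypothetical and the attributes that are actually populated depend on the point data record type of the file.

```python
import numpy as np
import laspy  # third-party library for reading and writing LAS/LAZ files

# Open a LAS point cloud (hypothetical file name)
las = laspy.read("uav_flight_01.las")

# Scaled coordinates and a few of the optional per-point attributes
xyz = np.vstack((las.x, las.y, las.z)).T
print("Points:", las.header.point_count)
print("Intensity range:", las.intensity.min(), "-", las.intensity.max())
print("Classes present:", np.unique(las.classification))
```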

There are some data formats that try to combine the strengths of binary and ASCII formats. Among the best known are the PLY and E57 data formats that allow point clouds to be stored in both ASCII and binary representations.

PLY: The PLY format is based on the OBJ format and was developed especially for storing 3D data. It is also called the Stanford Triangle Format. The similarity to the OBJ format becomes clear when considering the structure of an ASCII representation: points are defined first and then combined into flat polygons in a second step. The file format is also capable of storing additional information, for example about colour, transparency or texture.

E57: E57 is a vendor-neutral format that allows not only point clouds but also images and metadata to be stored very compactly. E57 files have a hierarchical tree structure. The sections containing the metadata (e.g. sensor information) are encoded as XML.


However, for reasons of efficiency, most of the data (the point data) is binary coded and not embedded in the XML sections. An advantage of the E57 data format over other formats (e.g. LAS) is its theoretically unlimited file size.

As already mentioned, the decision for one or the other point data format depends on the software and hardware that is to be or must be used. Especially among the software products, many allow a variety of import and export formats. Furthermore, the conversion of a binary file into an ASCII file for archiving purposes can be considered. Otherwise, it can be concluded that with LAS and E57 two formats are available that guarantee a high degree of interoperability.

3.1.3 Image data formats

Camera systems that deliver data as digital images form another important group of sensors on UAVs in environmental sciences. Their images record information over a broad electromagnetic spectrum. Chapters 2.4 and 2.5 provide a detailed description of imaging systems.

The development of image data formats started decades ago and has led to a broad range of different data formats with major or subtle differences. In general, two types of image data can be distinguished: RAW data formats and the so-called “developed formats”.

RAW data is data recorded by the camera’s electronic sensor and stored almost unchanged. Because there is currently no uniform standard for sensor hardware, RAW data formats differ between camera manufacturers. Today, raw data formats such as ARW (Sony), NEF (Nikon), CRW/CR2/CR3 (Canon) or RAF (Fuji) are in use. Approaches to find a common standard do exist: the digital negative (DNG) is considered an open data format but is still protected by license (Adobe Inc.).

“Developed formats” can be derived from raw data formats. Examples are the JFIF (JPEG File Interchange Format) or the TIFF (Tagged Image File Format), which are certainly the most widely used data formats. If RAW data are compared with, for example, data in JPEG format, some differences become apparent that are well suited to show the advantages and disadvantages of one data format over the other. RAW data have a colour depth of 10 to 16 bits, i.e. they are able to distinguish radiometric differences in 1,024 to 65,536 brightness levels (per colour channel). JPEG, on the other hand, can only store 256 different brightness levels (8 bit). If the brightness or colour range in an image is high, losses in the radiometric quality of the images are inevitable. Camera systems often already include RAW data converters, which give the user the choice of one and/or the other format, but JPEG images may have already gone through pre-processing steps such as white balance, tonal value corrections or noise reduction. The introduction of this chapter already pointed out the consequences of these pre-processing steps for the analysis of the image data. Especially when comparing different images of the same object, the differences could represent real changes or could be the result of such corrections. Already in the planning of UAV flights, it is therefore necessary to consider which data format is chosen to meet the requirements of the task.


If, for example, only point clouds (cf. SfM) or one-time geometric information is to be derived from the resulting images, the choice of format plays a rather minor role.

Nevertheless, JPEG also has advantages over RAW data formats. Data in JPEG format is very compact and much less storage-intensive than RAW data. For camera systems on UAVs, this matters because saving a RAW image can take considerably longer than saving a JPEG – in some cases around three times as long. This additional time has a direct effect on flight planning: the UAV would have to fly much slower to completely record the desired area. In the above example (triple saving time), the available battery capacity would cover only about one third of the required flight time, and additional resources (batteries, working time, etc.) would have to be provided.

An alternative to JFIF/JPEG is the lossless Tagged Image File Format (TIFF). TIFF allows a colour depth of up to 32 bits per colour channel. However, the conversion from a RAW data format may already include adjustments made by noise reduction or white balance procedures. Here again, the problems in data analysis explained above can arise.

In the environmental sciences, and especially in the use of UAVs, image data are usually required in georeferenced form. Here, the TIFF format offers with GeoTIFF a special variant that combines the advantages of lossless compression with additional information for spatial referencing. This includes the coordinates for georeferencing as well as information about the map projection and the coordinate reference system.
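A minimal sketch of how this embedded georeferencing information can be inspected is given below, using the third-party Python library rasterio (built on GDAL); the file name is hypothetical.

```python
import rasterio  # third-party library for reading georeferenced raster data

# Open an orthomosaic stored as GeoTIFF (hypothetical file name)
with rasterio.open("orthomosaic.tif") as src:
    print("Bands:", src.count, "| size:", src.width, "x", src.height)
    print("Coordinate reference system:", src.crs)
    print("Pixel size:", src.res)
    print("Affine georeferencing transform:", src.transform)
    red = src.read(1)   # first band as a NumPy array
```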

Finally, the decision for an image data format depends on the specific task. This is akin to the decision process for point cloud formats, but probably much simpler in the case of raster data. GeoTIFF has become a de facto standard for the storage of raster data in recent years. It can be assumed that all current software products and libraries for raster data processing can handle this type of data (reading and writing).

Finally, Table 3.1-1 provides an overview of the properties of the image data formats discussed.

Table 3.1-1: Image data formats and their properties (selection).

* There is a difference between what the format specifications allow and what the software tools support. We have tried to focus on the widely used variants.

3.1.4 Decision checklist

In the preceding chapters, the advantages and disadvantages of the common data formats for point clouds and raster images were briefly reviewed. It became clear that a decision for one or the other data format will very often depend on the individual requirements. Nevertheless, the following checklist points out where the choice of data formats in the use of UAS matters.

1. The selection of the sensor often implies a decision for a vendor-specific data format. It should be checked whether the available formats are compatible with the available software for data processing or can be effortlessly converted without loss of information.

2. The speed of data storage should be considered when data must be recorded at a high volume or high repetition rate. As a rule of thumb, smaller data formats provide faster write speeds, provided that the involved compression algorithms are efficiently implemented.

3. For the purposes of later data processing, care should be taken to ensure that fast data access is possible, especially if it is necessary to access subsets of a data set quickly. As data volumes grow, this aspect becomes increasingly important. If low latency, indexed access, and high throughput are not provided by the original format, consider a conversion to a more capable format prior to any data processing steps.

4. If the thematic dimension of a data set is high or varies within the data, a format is required that allows such data structures. For raster data, for example, it would have to be checked whether a data format is required that allows multilayer structures.

5. If large amounts of data are recorded and produced, data compression (or storage efficiency) should be a concern. It is important to check whether a data format can be selected that enables compression and whether a possible lossy compression has a negative influence on the further process of data analysis.

6. Especially in the field of environmental sciences, long-term observations and longitudinal studies are frequently conducted and should be anticipated in any UAS project. In such scenarios, when selecting a data format, care should also be taken to ensure that the formats can be expected to have long availability and tool support, or can be considered human readable, such as well-documented ASCII data. Alternatively, data curation strategies that involve converting the data to newer formats should be set up to ensure the data can still be used after maybe more than 20 years.

7. When deciding for or against a data format, attention should be paid to its degree of adoption. When sharing the data with other people, reusability is usually improved by picking a widely recognized data format with good tool support.

Since there are hundreds of data formats available, optimized for different purposes, it is impossible to recommend “a best” format. Hence, we recommend sticking to this checklist and the discussed features of the formats to evaluate the strengths and weaknesses of a particular format in your application context.


3.2 Analysis of imagery – automatic extraction of semantic information

Claudio Persello and Caroline Gevaert

3.2.1 Image classification workflow
3.2.2 2D image feature extraction
3.2.2.1 Deriving textural features I: Grey-Level Co-occurrence Matrix (GLCM)
3.2.2.2 Deriving textural features II: Local Binary Pattern (LBP)
3.2.3 2.5D feature extraction
3.2.4 3D feature extraction
3.2.4.1 Spatial binning
3.2.4.2 Planar segments and shape attributes
3.2.5 Feature selection
3.2.6 Supervised classification algorithms
3.2.6.1 Support Vector Machine
3.2.6.2 Random Forests
3.2.6.3 Deep learning classification and convolutional networks
3.2.7 Accuracy assessment
3.2.8 Summary

The advent and rapid development of radio-controlled platforms for aerial image acquisition is offering new opportunities for the acquisition of overhead imagery. UAVs allow us to perform acquisitions that can be easily repeated in time, over different geographical areas, and in conditions that are dangerous for human operators (e.g., after catastrophic events).


As discussed previously in chapter 2.2, multiple overlapping images acquired over the same area on the ground with different viewing angles can be used to obtain three-dimensional (3D) information of the target area. Thanks to recent developments in photogrammetry and computer vision, state-of-the-art dense matching techniques can generate Digital Elevation Models (DEMs) and 3D point clouds with accuracies and densities that were out of reach until recently (Hirschmüller, 2008). The extremely high spatial resolution of 2D images combined with 3D geometric information allows us not just to recognize a large and detailed set of thematic classes, but also to precisely characterize the objects in the area under investigation according to their material and geometry. In the context of urban studies, UAV data can be used to map buildings (Gevaert et al., 2018a), roads (Zhou et al., 2017) and monuments (Fiorillo et al., 2013), as well as to detect damage after catastrophic events such as earthquakes or flooding (Nex & Remondino, 2014). In vegetation-related studies, UAV data can provide estimates of biophysical parameters or detect early signs of plant stress or disease (Nex & Remondino, 2014). The capability to extract such information, with a flexible and relatively cheap acquisition process, is opening new opportunities, including 3D urban modelling, damage assessment and recovery action planning, precision agriculture, and the mapping of informal settlements (Gevaert et al., 2017) and cadastral boundaries (Xia et al., 2019). For all these applications, automated image analysis techniques capable of extracting semantic information efficiently and accurately are essential. In this domain, machine learning techniques play a fundamental role, in particular supervised classification algorithms, which are able to learn how to classify images from a set of training samples. Unsupervised algorithms such as clustering and segmentation do not require training data; they are used for separating different objects, but they cannot assign them class labels.

This chapter presents an overview of supervised classification strategies to extract semantic information from UAV data. The focus is on the automated classification of UAV data, considering the main processing steps of a classical workflow based on supervised learning algorithms (chapters 3.2.1–3.2.6). An overview of the deep learning approach, which has recently become popular thanks to the excellent feature extraction capabilities of convolutional networks, is given in chapter 3.2.6.3. Chapter 3.2.7 covers classification accuracy assessment aspects, and chapter 3.2.8 closes the chapter with a short summary.

3.2.1 Image classification workflow

The classical workflow for the classification of UAV images, also known as land-cover or land-use classification in the remote sensing literature (Tong et al., 2020), consists of a sequence of processing steps. In computer vision, this task is commonly called semantic segmentation (Long et al., 2015). The output of this process is a thematic map, where each pixel is labelled according to a predefined set of classes, e.g., land-cover categories.


The classification can be performed on a per-pixel basis or per region. In the first case, pixels are considered the atomic elements of the classification process. In the second approach, often referred to as object-based image analysis (OBIA) (Blaschke, 2010), the image is first divided into homogeneous regions through a segmentation algorithm. The segments (and therefore all pixels therein) are then classified according to user-defined rules or a supervised classification algorithm. In both cases, one of the fundamental points for obtaining accurate classifications is the extraction of the spatial information that characterizes the neighbourhood of individual pixels. This is fundamental for the analysis of extremely high-resolution imagery acquired from UAVs, where the objects of interest (e.g., buildings, roads) are typically much larger than the pixel size.

A diagram of the general classification workflow is reported in Figure 3.2-1. The first step involves all necessary pre-processing operations aimed at correcting geometric and radiometric distortions and includes the application of photogrammetric techniques to derive an orthorectified image, a digital elevation model (DEM) and a point cloud. This fundamental processing phase is important to obtain high-quality input data for the semantic analysis. This step was described in detail in chapter 2.2. The second stage involves the extraction of informative features for the classification of the input image. It involves the extraction of spatial-contextual features, which capture radiometric, textural and geometric information from the neighbourhood of the individual pixel. The extraction process can operate in a moving window manner or on the basis of homogeneous regions obtained by segmentation. This step is important to enhance the discrimination ability of the classes and to obtain accurate classifications by considering the spatial relations between pixels. Several techniques have been proposed in the literature to extract (2D) textural features (Haralick et al., 1973; Ojala et al., 2002) as well as 2.5D and 3D contextual features (Weinmann et al., 2015). Texture feature extraction based on the Grey-Level Co-occurrence Matrix (GLCM) and the Local Binary Pattern (LBP) is presented in chapter 3.2.2; 2.5D and 3D contextual feature extraction is covered in chapters 3.2.3 and 3.2.4, respectively. The feature extraction phase may result in a large number of potentially discriminative characteristics, not all of them being relevant for the supervised classification task. To remove redundant and non-informative features, feature selection is commonly adopted to identify a subset of the most relevant features for the problem at hand (chapter 3.2.5). Finally, the third step is the supervised classification. A classification algorithm is used to translate the features extracted in the previous step into a thematic map representing the spatial semantic information of the area under investigation. The focus of this chapter is on supervised classification algorithms, which require the availability of labelled samples for training the classification model. Popular algorithms are based on machine learning techniques like Support Vector Machine (SVM) (Bruzzone & Persello, 2009) or Random Forests (RF) (Belgiu & Drăguţ, 2016), which can derive accurate classifications from a set of heterogeneous features as input (chapter 3.2.6). Chapters 3.2.2–3.2.6 will enter into the details of feature extraction, selection and classification of UAV data.


Figure 3.2-1: Diagram of the classical image classification workflow. Unless otherwise stated, all images were prepared by the authors for this chapter.

3.2.2 2D image feature extraction

2D feature extraction from UAV imagery is quite similar to feature extraction from satellite imagery. In this chapter, a feature is considered to be a variable in an n-dimensional feature space, such as radiometric and texture features:

Radiometric features consist of the spectral bands of the UAV sensor, ranging from RGB to multispectral to hyperspectral and thermal, as well as derivatives of these bands such as vegetation indices. Other chapters of this book (chapters 2.4, 2.5 and 4) provide the reader with an overview of these features linked to specific applications.

Texture features can provide important supplementary information. For example, in urban settings, UAVs carrying only RGB cameras may not have the spectral resolution to distinguish between green roofs and vegetation. Texture features have proven to be useful in such examples, where the radiometric resolution of the imagery is not sufficient to distinguish between classes. Most texture features work with single-band images, so the first step is usually to convert a colour image into a grayscale image. Then, textures are identified by comparing the intensity of pixel values within a defined neighbourhood. Two main types of texture features are the GLCM and the LBP.

3.2.2.1 Deriving textural features I: Grey-Level Co-occurrence Matrix (GLCM)

Textural features can be extracted from a GLCM matrix. GLCM matrices describe how often a defined intensity combination (i.e. Grey-Level) of adjacent pixels occurs (i.e. Co-occurrence) (Haralick et al., 1973) within a moving window or kernel. Pixel adjacency is generally understood as horizontal, vertical, left-diagonal, and right-diagonal neighbours described by an offset (i, j). The user defines these offsets and constructs a GLCM matrix for each combination individually. The image is first re-quantized to N grey levels. The resulting GLCM matrix P will have a dimension of N x N. Higher values of N increase the computational complexity and processing requirements of the GLCM features, so in practice values of up to 64 are generally used. Texture patterns in UAV imagery can be oriented in different directions. For example, the linear texture created by roof material may sometimes be oriented North-South and sometimes East-West. The aim is to identify the roof texture regardless of which direction the roof is oriented. In other words, a good texture feature in remote sensing should have rotational invariance. A certain degree of rotational invariance can be introduced into GLCM features by normalizing the matrices for the different offset directions. The user will also select the kernel size, which can have a large impact on the results. A kernel that is too small does not cover enough of the image to capture dominant texture patterns. A kernel size that is too large will cover too many different texture patterns in the image and make it difficult to distinguish between them.

Textural features can be calculated from the GLCM matrix P. 28 such features were originally proposed (Haralick et al., 1973), including homogeneity, contrast, dissimilarity, entropy, angular second moment, mean, standard deviation, and correlation. These statistical measures are known as Haralick features. Many (remote sensing) image processing software packages offer the option to calculate GLCM texture features based on a user-defined kernel size, offset, and set of Haralick features. Unfortunately, it is difficult to know beforehand which kernel size will be optimal or which Haralick features best describe the texture patterns in the image that is being classified. In practice, different combinations of kernel sizes and texture features are tested in order to select the ones that obtain the highest classification accuracies.
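As a hedged illustration, the following sketch computes a GLCM and a few Haralick-type features for a single grayscale patch with scikit-image (the functions are spelled greycomatrix/greycoprops in older library versions); in a real workflow this would be repeated per moving window or per segment, and the random patch merely stands in for re-quantized UAV imagery.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

# Stand-in for a grayscale image patch re-quantized to 64 grey levels (values 0..63)
rng = np.random.default_rng(0)
gray = rng.integers(0, 64, size=(128, 128), dtype=np.uint8)

# Co-occurrence matrices for four offset directions at a distance of one pixel
glcm = graycomatrix(gray, distances=[1],
                    angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                    levels=64, symmetric=True, normed=True)

# A selection of Haralick-type features, averaged over the four directions
# to obtain a degree of rotational invariance
for prop in ("contrast", "homogeneity", "dissimilarity", "correlation"):
    print(prop, graycoprops(glcm, prop).mean())
```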

3.2.2.2 Deriving textural features II: Local Binary Pattern (LBP)

LBP texture features are less computationally intensive than GLCM features and are rotationally invariant. LBP features are computed by selecting a total of N neighbours evenly distributed on a circle with radius R around a central pixel (Ojala et al., 2002). A binary code with a length of N bits is obtained by comparing each neighbour to the value of the central pixel. The digit is assigned a value of one if the intensity of the neighbour is higher than that of the central pixel, and a value of zero if it is lower. Figure 3.2-2 displays two examples of LBP patterns calculated from UAV imagery considering eight neighbours (N=8) at a radius of one pixel (R=1). If we take the bottom pattern as an example, the first, second and seventh neighbours have a higher intensity value than the central pixel, so this corresponds to a code of 11000010. Rotational invariance can be obtained by applying bitwise rotation, or circular shift, until the lowest binary value is obtained.


The code 11000010 would then be transformed into 00001011. The next step is to introduce a “uniform” pattern by counting the number of transitions from 1 to 0 and from 0 to 1 in the rotationally invariant code. For example, 00001011 will change into 3. Ultimately, this process assigns a value between one and N+2 to each pixel (the extra two representing codes of only 0s or only 1s), where the number represents a uniform and rotationally invariant texture pattern.

Figure 3.2-2: Example of extracting LBP texture features (R=1, N=8) from a UAV image.

For aerial image classification applications, it is practical to calculate the LBP texture pattern for different values of R and N. In practice, useful combinations of [R, N] are [1,8], [2,16], and [3,24]. The inclusion of more neighbours (i.e. higher values of N) is computationally inefficient.

Each [R, N] combination results in a raster with codes indicating the texture pattern surrounding each pixel. This raster tends to be very noisy, with many different texture patterns observed by neighbouring pixels. Rather than using the raster of LBP codes directly as classification features, it is common practice to calculate the relative frequency of the LBP codes over certain image segments. This is done by first selecting an image patch (using either a moving window or image segmentation techniques) and then computing the normalized histogram giving the frequency of each LBP code within that patch. In this way, noise and small artefacts are removed and the feature used for classification considers texture patterns over a larger area.
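A brief sketch of this procedure with scikit-image is shown below; the random patch is only a stand-in for a grayscale UAV image patch, and the histogram over the N+2 uniform codes is the classification feature described above.

```python
import numpy as np
from skimage.feature import local_binary_pattern

# Stand-in for a grayscale image patch (moving window or segment)
rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(128, 128), dtype=np.uint8)

R, N = 1, 8                                          # radius and number of neighbours
codes = local_binary_pattern(gray, P=N, R=R, method="uniform")

# Normalized histogram of the N+2 uniform, rotation-invariant codes within the patch
hist, _ = np.histogram(codes, bins=np.arange(N + 3), density=True)
print(hist)                                          # feature vector for this patch
```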

Note that LBP textures only consider whether a neighbour is brighter than the central pixel, but not by how much. The variance of the N neighbours is therefore usually included as a supplementary feature to the LBP code. The final textural features used for image classification are then the relative frequencies of the LBP codes and the variance features for each [R, N] combination.


3.2.3 2.5D feature extraction

2.5D (topographic) features refer to features extracted from a Digital Surface Model (DSM) or Digital Terrain Model (DTM) (see chapter 5.4 for a description of DSMs and DTMs). These features give an indication of the elevation characteristics of objects in the scene but do not take the full 3D geometry into account. For example, morphological filters applied to DSMs can be used as topographic features for UAV scene classification. This is especially useful for scenes where it is difficult to obtain the DTM due to the limited availability of points on the ground, or for scenes with very steep topography. Various studies have demonstrated the utility of morphological top-hat filters applied to the elevation data (Arefi & Hahn, 2005; Mongus et al., 2014). These filters provide information regarding the height of a pixel compared to the neighbouring pixels which fall within a user-defined structuring element. A multi-scale topographic feature set can be constructed by applying top-hat filters with structuring elements of various sizes to the DSM obtained from the UAV, as sketched below. This provides information regarding the height of an object compared to its neighbours, and the utilization of multiple structuring elements provides an indication of the expected size of the object.
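A minimal sketch of such a multi-scale top-hat feature stack is given below, using SciPy's morphological white top-hat; the DSM array is simulated and the structuring element sizes are purely illustrative.

```python
import numpy as np
from scipy import ndimage

# Stand-in for a DSM: a 2D array of surface heights (in practice read from a GeoTIFF)
rng = np.random.default_rng(0)
dsm = rng.normal(loc=100.0, scale=2.0, size=(200, 200))

# Multi-scale morphological top-hat: height of each pixel above its local neighbourhood,
# computed for structuring elements of increasing size (sizes in pixels, illustrative)
scales = (5, 15, 31)
features = np.dstack([ndimage.white_tophat(dsm, size=s) for s in scales])
print(features.shape)   # (rows, cols, number of scales)
```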

3.2.4 3D feature extraction

3.2.4.1 Spatial binning

3D features are computed directly on the point cloud (as opposed to 2.5D features, which are calculated from the DSM or DTM). The benefit of full 3D features is that they can capture more detailed 3D information about the scene in question. However, the 3D features must be converted into a 2D raster in order to be combined with the 2D and 2.5D features from the previous sections.

One of the simplest ways to do this is through spatial binning, sometimes known as elevation images. First, the “spatial bins” are constructed by creating a grid whose boundaries align with the geographical coordinates of the 2D image pixels. Features can then be obtained by determining, for all 3D points falling into each bin, the total number of points, the maximal height difference, and the standard deviation of the heights. Note that if this results in many empty bins (due to a low point cloud density), a larger spatial resolution should be selected or smoothing techniques should be used.
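The following sketch bins a point cloud into such a pixel-aligned grid with NumPy and derives two of the features mentioned above (point count and maximal height difference); the coordinates, grid origin and cell size are simulated and purely illustrative, and the per-bin standard deviation can be accumulated in the same way.

```python
import numpy as np

# Stand-in for an (n, 3) array of X, Y, Z point coordinates
rng = np.random.default_rng(0)
xyz = rng.uniform([0.0, 0.0, 90.0], [50.0, 50.0, 110.0], size=(100_000, 3))

origin_x, origin_y, cell = 0.0, 50.0, 0.5        # upper-left grid corner and cell size
n_rows, n_cols = 100, 100                        # grid aligned to the image pixels

# Assign every point to a spatial bin (raster cell)
col = np.clip(((xyz[:, 0] - origin_x) / cell).astype(int), 0, n_cols - 1)
row = np.clip(((origin_y - xyz[:, 1]) / cell).astype(int), 0, n_rows - 1)
flat = row * n_cols + col

# Per-bin features: number of points and maximal height difference
count = np.bincount(flat, minlength=n_rows * n_cols)
zmax = np.full(n_rows * n_cols, -np.inf)
zmin = np.full(n_rows * n_cols, np.inf)
np.maximum.at(zmax, flat, xyz[:, 2])
np.minimum.at(zmin, flat, xyz[:, 2])
height_range = np.where(count > 0, zmax - zmin, np.nan)

count_img = count.reshape(n_rows, n_cols)        # feature raster 1
range_img = height_range.reshape(n_rows, n_cols) # feature raster 2
print(count_img.mean(), np.nanmax(range_img))
```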


3.2.4.2 Planar segments and shape attributes

Another way to convert complex 3D shape attributes to a 2D raster is to calculate the shape attribute for the highest point in each spatial bin. As the UAV captures imagery from above, it makes sense that the radiometric information in an image pixel will correspond to the object represented by the highest point in the point cloud for the spatial bin corresponding to this pixel. 3D shape attributes can be approximated by first defining a neighbourhood (either limited by size or by a maximum number of neighbours) around a point in the point cloud. The normalized eigenvalues of the covariance matrix constructed from the X, Y, Z coordinates of these points provide an indication of the shape of this neighbourhood (Chehata et al., 2009). For example, a linear structure will have a very large primary eigenvalue and relatively small secondary and tertiary eigenvalues. A planar surface will have large primary and secondary eigenvalues, but a much smaller tertiary eigenvalue. Various studies provide overviews of such geometric shape attributes, which can be obtained from point clouds based on local neighbourhoods (Demantké et al., 2012; Weinmann et al., 2015). In some cases, planar features extend over much larger areas than can easily be represented by local neighbourhoods in high-density UAV point clouds, for example flat terrain in a field or large roof surfaces. Such large planar segments can be extracted more easily through surface-growing algorithms (Vosselman, 2013). Features can then be extracted from each planar segment, such as the number of points per segment, the average residual to the segment, the inclination angle of the segment, and the maximal height difference between the segment and directly neighbouring points. More methods regarding the analysis of point clouds, such as segmentation and classification, can be found in chapter 3.5.

3.2.5 Feature selection

The previous sections describe a wide range of features that can be extracted from UAV data. Although more features can provide more detailed information to capture the differences between classes, the additional complexity may actually reduce the classification accuracy. This is known as the “curse of dimensionality”, or Hughes phenomenon. It is especially a problem when limited training data are available. Feature selection is an effective way to mitigate the curse of dimensionality and limit unnecessary data processing. It reduces the total number of features to a selected set which provides the most discriminatory information for the classification task at hand. Feature selection methods are made up of a search strategy and a criterion function (Richards, 2013; Persello & Bruzzone, 2016). The search strategy is the method which the algorithm uses to select different subsets of features. The criterion function allows you to rank the different subsets and select the set of features which has the highest performance for your classification problem.


There are three main types of feature selection methods: filter, wrapper, and embedded methods (Chandrashekar & Sahin, 2014). Filter methods use statistical metrics to define the dependence of the class labels on each individual feature in the feature set (i.e. the criterion function) and then rank the features in order of importance (i.e. the search strategy). Benefits of filter methods include fast computation and independence from the classification method used. However, this method may sometimes select two features which have a very high correlation to the class label, but also to each other. This is known as feature redundancy, because the two selected features contain redundant information. Feature redundancy can be addressed using search strategies such as Sequential Forward Selection (SFS) and Sequential Backward Selection (SBS). SFS starts in the same way, by selecting the most important feature. The difference lies in how the subsequent features are ranked. Rather than assessing the information contained in each feature separately, SFS adds each remaining feature to the selected features individually. Each time, it calculates the improvement in the statistical importance metric. It then adds the feature which has enhanced this criterion the most. This process is repeated to select the remaining features. SFS thus avoids feature redundancy by answering the question: which new feature will improve my set of selected features the most? SBS works in the opposite direction. It first calculates the criterion function on the entire set of features. It then removes a feature from the set of selected features and calculates the decrease in the criterion function. It does the same for each feature. The feature which causes the smallest decrease is considered to be the least important and is removed from the set of selected features. The same process is followed to keep removing the least important features. Both SFS and SBS are sub-optimal search strategies, meaning that they do not test each and every feature set combination, but they are computationally more efficient. These heuristics can be used to find a good subset of features, although it may not strictly be the optimal one. Criterion functions for filter methods may include divergence, the Jeffries-Matusita (JM) distance, and transformed divergence (Richards, 2013).

Wrapper methods recursively perform the classification with different feature subsets and use the classification accuracy as the criterion function to identify important feature subsets (Guyon et al., 2002). The largest disadvantage of wrapper methods is their considerable computational cost. Therefore, the literature often uses a hybrid model which first employs a filter method to remove irrelevant or weakly associated features, and then uses a wrapper method to identify the optimal subset. Embedded methods include feature selection in the training process of the classifier, and features are ranked according to their contribution to the model. Recursive Feature Elimination for Support Vector Machines (SVM-RFE) is a common example: it uses the weight vector w to rank important features (see details on the SVM classifier in the next section).
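A compact, hedged sketch of such an embedded selection with scikit-learn is shown below; the feature matrix and labels are simulated stand-ins for stacked radiometric, textural, 2.5D and 3D features of labelled training pixels.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.feature_selection import RFE

# Stand-in training data: n_samples x n_features feature matrix and class labels
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 30))
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)

# SVM-RFE: recursively drop the features with the smallest weight in the
# linear SVM weight vector w, until only the requested number remains
selector = RFE(estimator=SVC(kernel="linear", C=1.0),
               n_features_to_select=10, step=1)
selector.fit(X, y)

print("Selected feature indices:", np.flatnonzero(selector.support_))
print("Feature ranking (1 = kept):", selector.ranking_)
```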


3.2.6 Supervised classification algorithms

Supervised classification algorithms aim to learn a general mapping rule, i.e., a partition of the feature space to assign a class label to an input pattern (feature vector). The learning process takes advantage of a set of labelled examples named the training set. In semantic segmentation, the input patterns are commonly associated with individual pixels, although it is also possible to consider image segments according to an object-based image analysis approach. In the latter case, features are extracted per segment and are then considered as input vectors for the supervised classification. In both cases, the spatial-contextual information is captured during the feature extraction phase preceding this step. The output is a thematic map, where each pixel of the input image is assigned to one of the predefined class labels. Pixels are commonly mapped according to multi-class land-cover or land-use classes, but depending on the application, they can be classified according to a specific label set. For example, in a damage assessment application, pixels are labelled as “damaged” or “not damaged” by solving a binary classification problem. Many detection problems can be modelled as a binary classification problem, e.g., building detection, road detection, flood mapping, change detection, boundary delineation. Several supervised algorithms have been explored in the remote sensing literature, including 1) Gaussian Maximum Likelihood (GML) (Paola & Schowengerdt, 1995), 2) Artificial Neural Networks (ANN), also known as Multi-Layer Perceptron (MLP) (Benediktsson et al., 1990), 3) Decision trees (Pal & Mather, no date), 4) RF (Gislason et al., 2006) and 5) SVM (Gualtieri & Chettri, no date; Cortes & Vapnik, 1995). GML is a probabilistic classifier that adopts a parametric model for the distribution of the classes (for this reason it is called parametric), more specifically the normal (Gaussian) model, resulting in quadratic decision surfaces in the feature space. ANNs, RF and SVM are non-parametric (or distribution-free) classifiers; that means that they do not require an explicit assumption on the distribution of the classes. An important type of ANN, specifically designed for image analysis, are Convolutional Neural Networks (CNNs). The remainder of this section will focus on popular non-parametric techniques: SVM, RF, deep learning, and CNNs.

3.2.6.1 Support Vector Machine

SVM implements a binary classification strategy that exploits a geometrical criterion rather than a statistical one. In other words, SVMs do not estimate the statistical distributions of the classes to carry out the classification task, but derive the model by exploiting the concept of margin maximization. The success of SVMs in many applications, including remote sensing and UAV image analysis, is rooted in a number of attractive properties (Burges, 1998; Vapnik, 1998; Cristianini & Shawe-Taylor, 2000; Schölkopf & Smola, 2002):


1. A non-parametric approach that does not require an explicit assumption on the distribution of the classes, unlike probabilistic techniques such as GML;

2. A discriminative strategy, which does not explicitly estimate the distribution of the classes, but focuses on deriving the optimal decision boundary directly;

3. An effective approach to improve the generalization ability (i.e., the ability to classify unseen data correctly) based on a regularized loss function (also called the structural risk minimization principle);

4. The possibility to solve non-linearly separable classification problems by implicitly projecting the data into a high dimensional feature space and separating the data with a simple linear function.

Let us consider the problem of pixel-wise classification of a generic image of size $I \times J$ pixels. We assume that a training set of $N$ pairs $\{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$ is available, where $\mathbf{x}_i$ are feature vectors associated with pixels (or segments) and $y_i$ are the corresponding labels. For the sake of simplicity, we focus here on the two-class case, while multi-class problems can be solved by combining multiple binary classifiers. Accordingly, let us assume that $y_i \in \{+1, -1\}$ is the binary label of the pattern $\mathbf{x}_i$. The goal of the SVM is to divide the $d$-dimensional feature space into two subspaces, one for each class, through a separating hyperplane $H: f(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} + b = 0$. The decision rule used to find the membership of a test sample is based on the sign of the discrimination function $f(\mathbf{x})$. Therefore, a generic test pattern $\mathbf{x}$ is labelled according to the following rule:

$$f(\mathbf{x}) \geq 0 \Rightarrow \mathbf{x} \in \text{class } +1, \qquad f(\mathbf{x}) < 0 \Rightarrow \mathbf{x} \in \text{class } -1 \quad (1)$$

The training of an SVM consists of finding the position of the hyperplane $H$, i.e. estimating the values of the vector $\mathbf{w}$ and the scalar $b$, according to the solution of an optimization problem. From a geometrical point of view, $\mathbf{w}$ is a vector perpendicular to the hyperplane $H$ and thus defines its orientation. The soft-margin training algorithm, designed to handle data which are not linearly separable, consists in minimizing a cost function (also called loss function) expressed by the combination of two criteria: 1) margin maximization, and 2) error minimization:

$$\psi(\mathbf{w}, b, \boldsymbol{\xi}) = \frac{1}{2}\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{N} \xi_i \quad (2)$$

where $\xi_i$ are so-called slack variables, which control the penalty for the misclassification of training samples (see Figure 3.2-3 for an illustrative example). $C$ is a regularization parameter that controls the penalty associated with errors, and thus controls the trade-off between training errors (empirical risk) and the width of the margin (generalization ability). If the value of $C$ is too small, many errors are permitted, and the discriminant function will poorly fit the data (underfitting); on the opposite, if $C$ is too large, the classifier may overfit the data instances, thus resulting in low classification accuracy on the test set, i.e., unseen data. Careful tuning of the $C$ value is crucial and should be derived through an accurate model selection phase. The minimization of the cost function (2) is subject to the following constraints:

$$y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \geq 1 - \xi_i, \quad i = 1, 2, \ldots, N$$
$$\xi_i \geq 0, \quad i = 1, 2, \ldots, N \quad (3)$$

resulting in a quadratic optimization problem subject to inequality constraints. It is difficult to solve this optimization problem directly; therefore, the Lagrange theory is usually applied to transform it into a dual formulation (Cristianini & Shawe-Taylor, 2000).

Figure 3.2-3: Illustrative example of the SVM discriminant function for a binary classification task.


Solving this optimization problem leads to the linear SVM classifier. However, one of the main advantages of SVMs is the possibility to extend them to non-linear discriminant functions by means of an elegant mathematical expedient. Instead of using more complex discriminant functions, the input data are projected into a high-dimensional feature space where a linear function can better separate the transformed samples. This is done by replacing the inner product in the mathematical formulation of the problem with a kernel function, defined as

$$K(\mathbf{x}_i, \mathbf{x}_j) = \Phi(\mathbf{x}_i) \cdot \Phi(\mathbf{x}_j), \quad i, j = 1, \ldots, N \quad (4)$$

where $\Phi$ denotes the mapping into the transformed space, so that the inner product in that space is calculated implicitly.

Once the problem is solved (in its dual form) with respect to the Lagrange multipliers $\alpha_i$, the discrimination function becomes:

$$f(\mathbf{x}) = \sum_{i=1}^{N} \alpha_i y_i K(\mathbf{x}_i, \mathbf{x}) + b = \sum_{i \in SV} \alpha_i y_i K(\mathbf{x}_i, \mathbf{x}) + b \quad (5)$$

where $SV$ is the set of support vectors, i.e., the training samples associated with $\alpha_i > 0$. The SVM solution is sparse in the sense that only a subset of the training samples, i.e., the support vectors, contribute to the definition of the membership function.
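In practice, SVM classifiers of this kind are rarely implemented from scratch. As a hedged sketch, the snippet below trains a soft-margin SVM with an RBF kernel in scikit-learn and selects C (and the kernel width) by cross-validated grid search, using simulated stand-ins for the training feature vectors and labels.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Stand-in training set: feature vectors x_i and binary labels y_i
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 12))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Soft-margin SVM with an RBF kernel; C and gamma are tuned by model selection
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
grid = GridSearchCV(model,
                    param_grid={"svc__C": [0.1, 1, 10, 100],
                                "svc__gamma": ["scale", 0.01, 0.1, 1.0]},
                    cv=5)
grid.fit(X, y)

print("Best parameters:", grid.best_params_)
print("Support vectors per class:",
      grid.best_estimator_.named_steps["svc"].n_support_)
```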

3.2.6.2 Random Forests

Ensemble classifiers are classification algorithms that are based on a number of individual supervised classifiers. Random Forests are ensemble classifiers made up of individual Classification and Regression Trees (CART) (Breiman, 2001) (explained in more detail in the next paragraph). Each CART in the RF is trained by using a random subset of the training data (Figure 3.2-4). A bagging approach randomly selects the training data subset for each CART through bootstrapping (i.e. random sampling with replacement). A boosting approach gives training samples which are difficult to classify in one CART a higher likelihood of being selected to train the next CART. Benefits of using ensemble classifiers like RF are that, by aggregating the results of the individual CARTs, a greater accuracy can be achieved and the classifier is more robust to noise in the training samples. Due to the high accuracies obtained by RF classifiers and the ease and speed of training, RF has become a popular classification method in the remote sensing community (Belgiu & Drăguţ, 2016).


Figure 3.2-4: Illustrative example of a Random Forest classifier constructed from three CARTs using a bagging approach.

Each individual CART is made of nodes and leaves. The first node (the root node) generally contains about two-thirds of the N training samples (though this depends on the specific software package and can sometimes be altered). The remaining training samples are retained as a validation set, known as the out-of-bag samples. Each node in the CART splits the training samples into two groups based on a selected number of features, with the aim of increasing the homogeneity or purity of the two descending nodes. Consecutive nodes therefore split the group of training samples into increasing degrees of purity, until the final leaves of the tree assign a class label to the training samples. The degree of purity can be calculated by the Information Gain or the Gini index (6):

$$G = 1 - \sum_{k=1}^{c} p_k^{2} \quad (6)$$

where $G$ is the Gini index, $c$ is the total number of classes, and $p_k$ is the relative frequency of class $k$ in the samples present at that given node. A Gini index of one indicates a low homogeneity and an equal distribution of classes within the node, and a Gini index of zero indicates the presence of only a single class within the node. During the training phase, the Gini index of each feature in a parent node and its two child nodes is calculated. The feature which causes the largest decrease in the Gini index between the parent and child nodes (in other words, the feature which increases the purity of the child nodes the most) will be selected.


The user must define a number of parameters when training an RF. The first parameter is the number of individual trees to train in the ensemble classifier. Sensitivity studies for remote sensing applications show that this parameter generally does not have a big effect on the classification accuracies. The second parameter is the number of features to randomly select and present to each individual node during the training phase. This seems to have a stronger influence on the classification results. A good rule of thumb is to set it to the square root of the number of input variables (Gislason et al., 2006; Belgiu & Drăguţ, 2016). Finally, the user can also set the maximum number of samples allowed in each leaf. Allowing more samples per leaf tends to create smaller trees and smoother results, whereas fewer samples per leaf will produce more heterogeneous results.
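As a hedged sketch, the snippet below trains such a Random Forest with scikit-learn, following the parameter guidance above (number of trees, square-root rule for the features tried per split, samples per leaf); the training data are simulated stand-ins for labelled pixels and their extracted features.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Stand-in training data: feature matrix and class labels of training pixels
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 25))
y = (X[:, 0] - X[:, 3] > 0).astype(int)

rf = RandomForestClassifier(n_estimators=200,        # number of trees in the ensemble
                            max_features="sqrt",     # ~sqrt(n_features) tried per split
                            min_samples_leaf=5,      # more samples per leaf -> smoother trees
                            oob_score=True,          # accuracy on the out-of-bag samples
                            random_state=0)
rf.fit(X, y)

print("Out-of-bag accuracy:", rf.oob_score_)
print("Most informative features:", np.argsort(rf.feature_importances_)[::-1][:5])
```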

3.2.6.3 Deep learning classification and convolutional networks

The classification workflow described above is based on the extraction of spatial features specifically designed to address the problem at hand. The corresponding methods depend on several free parameters, which are usually set according to user experience or by trial and error. An exhaustive optimization of the parameter values is computationally expensive, especially when large spatial neighbourhoods need to be considered, as is the case for high-resolution UAV images. Moreover, traditional feature extraction techniques are unsupervised, which means that the extraction is not guided by the specific classification task through a supervised learning process. Deep learning networks can partly overcome the above-mentioned issues by automatically learning spatial features from the input data (Zhu et al., 2017). Deep ANNs are computational models in which the input data is gradually transformed through a sequence of processing layers that extract intermediate features and finally predict the target output (LeCun et al., 2015). In a supervised setting, the network is trained with a set of training data exemplifying the functional relationship between input and output. The training is an iterative process that tunes the free parameters of the network to minimize a cost (or loss) function. The procedure for training ANNs is based on the backpropagation algorithm (LeCun et al., 1998) and the most common technique is called stochastic gradient descent (SGD) (Weinmann et al., 2015).

CNNs are a type of ANNs which are specifically designed for image analysis (or any data that come in the form of multiple 2D arrays). Like other ANNs, they are composed of a sequence of processing layers that perform an affine transformation of the input data followed by a non-linear activation function. The main building blocks of CNNs are: 1) 2D convolution (see chapter 3.3), 2) an activation function, and 3) spatial pooling. The weights of the convolution operations
are shared at each pixel location and learned through a supervised learning process aimed at minimizing the classification error. The activation function is a non-linear transformation, such as the sigmoidal function or the linear rectifier (Nair & Hinton, 2010). The pooling performs spatial aggregation by taking the average or the maximum value of the image in non-overlapping windows of fixed size (e.g., 2×2). Standard architectures use a sequence of convolutional layers to extract feature maps, interleaving the three main processing operations described above. Through progressive pooling operations, the feature maps are then flattened into a 1D vector and fed to a fully connected network, which corresponds to a conventional ANN. Figure 3.2-5 shows the architecture of a popular CNN, named VGGNet after the name of the research group that developed it (Simonyan & Zisserman, 2015). The convolutional layers are responsible for learning the spatial features, whereas the fully connected layers learn the classification rule to be applied to the extracted feature vector. The network is trained in an end-to-end fashion; hence, feature extraction and classification occur simultaneously in a single supervised learning algorithm. This approach has been shown to be effective in various computer vision tasks, including multimedia image classification, where one label is assigned to the entire input scene. Deep CNNs have been successfully applied to image categorization benchmarks such as the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) (Russakovsky et al., no date), considerably outperforming techniques based on hand-crafted features.
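The following minimal PyTorch sketch illustrates this sequence of building blocks (convolution, activation, pooling, followed by a fully connected classifier). The layer sizes, the number of classes and the 64×64 input patch size are hypothetical choices for illustration and do not correspond to the published VGGNet configuration.

```python
import torch
import torch.nn as nn

class TinyConvNet(nn.Module):
    def __init__(self, n_bands=3, n_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(n_bands, 16, kernel_size=3, padding=1),  # learned 2D convolution
            nn.ReLU(),                                          # non-linear activation (rectifier)
            nn.MaxPool2d(2),                                    # 2x2 spatial pooling
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                       # flatten the feature maps into a 1D vector
            nn.Linear(32 * 16 * 16, n_classes)  # fully connected layer, assuming 64x64 input patches
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# Example: one label per 64x64 input patch (multimedia-style image classification)
logits = TinyConvNet()(torch.zeros(8, 3, 64, 64))  # output shape: (8, 5)
```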

Figure 3.2-5: Architecture of VGGNet (Simonyan & Zisserman, 2015).

Source (Bezdan & Bačanin Džakula, 2019).


CNNs have also been adapted to perform pixel-wise image classification. The standard patch-based approach consists in training the CNN to label the central pixel of patches extracted from the input image (Bergado et al., 2016). This, however, results in redundant processing at inference time and therefore in high computational cost when applied to large RS images. Currently, the most effective architectures are the so-called Fully Convolutional Networks (FCNs), which are trained to infer pixel-wise labels of the entire input image. In these networks, the fully connected layers are usually substituted by one or multiple layers that up-sample the feature maps extracted by the convolutional layers (e.g., by applying bilinear interpolation or transposed convolutional filters) to the resolution of the input image (Long et al., 2015; Noh et al., 2015; Ronneberger et al., 2015b; Badrinarayanan et al., 2017). Long et al. (2015) adapted contemporary CNNs into FCNs and fine-tuned them to address semantic segmentation. More recent networks use an encoder-decoder structure (see Figure 3.2-6), using various strategies for up-sampling the feature maps learned by the encoder to the resolution of the input image (Noh et al., 2015; Ronneberger et al., 2015a; Badrinarayanan et al., 2017). An alternative approach is to use no-down-sampling networks employing dilated convolutional filters as in Yu & Koltun (2016) and Persello & Stein (2017). Figure 3.2-7 shows the classification workflow based on a deep learning approach: FCNs allow us to merge the two traditionally distinct steps of feature extraction and classification into one optimized processing step.

Figure 3.2-6: Architecture of SegNet (Badrinarayanan et al., 2017), popular encoder-decoder fully convolutional network for semantic segmentation.


Figure 3.2-7: Diagram of the deep learning classification workflow based on a fully convolutional network. Please note the difference to Figure 3.2-1: Feature learning and classification are merged into one single supervised algorithm.

3.2.7 Accuracy assessment

The final step is to assess the accuracy of the supervised classifier. The accuracy will determine whether the map is fit-for-purpose and acceptable for the intended application. It is conducted by comparing the results of the classifier with reference data. This reference data represents the actual class label on the ground. It can be obtained from fieldwork or other thematic maps or remote sensing imagery sources. It is important to ensure that there is no spatial or temporal shift between the reference data and the map data. For example, land-use changes could occur between the date of field data collection and the UAV image capture. The model may appear to contain false errors if this field data is then used to assess the accuracy of a classification model based on the UAV imagery. In practice, the spatial detail of UAV imagery is so high that visual interpretation can often be used to manually digitize reference data on top of the imagery.

A suitable set of reference data, also known as testing data, must satisfy a number of characteristics. Firstly, it must represent all of the classes targeted by the supervised classification algorithm. The number of samples per class is still under research, but in general, the more, the better. Some sources recommend 30 to 60 samples per class, though the supervised classification of UAV imagery can easily result in hundreds of samples per class. Ideally, the reference data should also be balanced. That is to say that each class is represented by approximately the same frequency. However, this is often difficult in practice as some classes will be much more abundant in the imagery than others. A practitioner may therefore need to make a selection of the samples to ensure that the classes are more balanced.

Reference data should be collected according to an adequate sampling design strategy. Random sampling distributes the number of sample points over the study area ad hoc. Systematic sampling distributes the points evenly over the study area in a grid-like pattern. Random and
systematic sampling are easy to implement and result in unbiased reference data. Disadvantages include that they may select areas that are difficult to visit in the field or observe in the image, and that rare classes might be omitted from the sampling. Stratified sampling ensures that each class is represented in the reference data. This sampling design strategy selects a number of testing samples per class depending on the relative frequency of that class in the output map (Warner et al., 2009).

Once reference data is selected, we can proceed to assess the accuracy of the supervised classification. This is commonly done with the confusion matrix, also known as an error matrix, which compares the reference labels with the labels predicted by the supervised classifier. For example, cell c_{i,j} gives the frequency of testing samples with the class label i in the predicted map and class label j in the reference data. Table 3.2-1 displays an example of a confusion matrix for binary classification.

Table 3.2-1: Example of a confusion matrix with multiple classes.

Various accuracy metrics can be extracted from the confusion matrix. The diagonal of a confusion matrix indicates the number of correctly classified samples. These cells represent true positives, because the thematic map correctly predicts the reference class (i = j). The other cells in the matrix represent samples that were misclassified. For example, samples in cell c_{1,2} are false positives, as the thematic map predicts they are class 1, but the reference class is actually 2. The overall accuracy (OA) of a classification is the proportion of correctly classified pixels and can be calculated by taking the sum of the diagonal divided by the total number of testing samples (7). The user's accuracy (UA) is the probability that a pixel assigned to a class in the thematic map actually belongs to that class on the ground (8). This is sometimes also known as correctness or precision. The producer's accuracy (PA) is the probability that a reference sample has been correctly classified by the algorithm (9). This is also known as the completeness, or recall. It is common to provide the total UA and PA averaged over all thematic classes. However, these average metrics can be misleading if
the classes are unbalanced. Therefore, the F1-score is often presented as it includes both the UA and PA (10) (Warner et al., 2009). Sometimes you will see the kappa coefficient used to report classification accuracies in remote sensing studies. However, there are convincing arguments that this is not appropriate because, e.g., it reports the overall agreement above chance agreement (yet chance agreement is not relevant for remote sensing classification problems) and it is difficult to interpret kappa values (Foody, 2020).

OA = \frac{1}{N} \sum_{i=1}^{c} c_{ii}   (7)

UA (for class i) = c_{ii} / c_{i+}   (8)

PA (for class i) = c_{ii} / c_{+i}   (9)

F1 (for class i) = 2 \cdot UA \cdot PA / (UA + PA)   (10)

where N is the total number of testing samples, c_{i+} is the row total of the confusion matrix (all samples predicted as class i), and c_{+i} is the column total (all reference samples of class i).
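A minimal sketch of how these metrics can be computed from predicted and reference labels (assumed to be encoded as integers 0…c−1) is given below; the tiny example arrays are purely illustrative.

```python
import numpy as np

def confusion_matrix(pred, ref, n_classes):
    """cm[i, j] = number of samples predicted as class i with reference class j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for p, r in zip(pred, ref):
        cm[p, r] += 1
    return cm

def accuracy_metrics(cm):
    oa = np.trace(cm) / cm.sum()        # overall accuracy (eq. 7)
    ua = np.diag(cm) / cm.sum(axis=1)   # user's accuracy / precision per class (eq. 8)
    pa = np.diag(cm) / cm.sum(axis=0)   # producer's accuracy / recall per class (eq. 9)
    f1 = 2 * ua * pa / (ua + pa)        # F1-score per class (eq. 10)
    return oa, ua, pa, f1

cm = confusion_matrix(pred=[0, 0, 1, 1, 1], ref=[0, 1, 1, 1, 0], n_classes=2)
print(accuracy_metrics(cm))
```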

3.2.8 Summary

This chapter presented an overview of the most common techniques used in the analysis of UAV images to produce a thematic map, i.e., associating a semantic label to each pixel of the image. Considering the high spatial resolution that can be achieved by UAV data, one of the fundamental challenges of the last decade has been to characterize the spatial-contextual information and extract discriminative features for automated classification. We have seen that the classical approach requires us to “handcraft” those spatial features considering 2D, 2.5D, and 3D information, which can be a cumbersome procedure. The deep learning approach simplifies the workflow by learning those features directly from the training data, assuming that enough labelled data is available. This approach has proved very effective in many applications and is going to play a fundamental role in research as well as in the operational use of UAV images for mapping purposes. Nevertheless, the high computational requirements and the need for large training data sets may limit the use of deep learning. We have highlighted the role of convolutional networks designed for pixel-wise labelling, i.e., fully convolutional networks. Recent literature has shown that these networks can be applied to a large number of applications, including the extraction of DTMs (Gevaert et al., 2018b), delineation of agricultural boundaries (Persello et al., 2019), cadastral boundaries (Xia et al., 2019), and mapping of urban areas (Persello & Stein, 2017). We expect that in the near future, advanced computer
vision methods for semantic segmentation and object detection will become more and more popular and will find a number of applications in the extraction of semantic information from UAV data.

References for further reading


3.3 Image sequence processing

Anette Eltner, Salvatore Manfreda and Borbala Hortobagyi

3.3.1 Image pre-processing
3.3.1.1 Image ortho-rectification
3.3.1.2 Image co-registration
3.3.1.3 Image filtering
3.3.2 Feature-based tracking
3.3.3 Patch-based tracking
3.3.3.1 Tracking in the spatial domain
3.3.3.2 Tracking in the frequency domain
3.3.3.3 Improving robustness and accuracy
3.3.4 Tracking strategies
3.3.5 UAV monitoring applications
3.3.5.1 Streamflow
3.3.5.2 Landslide
3.3.5.3 Glacier

Measuring object displacement and deformation in image sequences is an important task in remote sensing, photogrammetry and computer vision, and a vast number of approaches have been introduced (Leprince et al., 2007; Alba et al., 2008; Debella-Gilo & Kääb, 2011). In the field of environmental sciences, applications are, for instance, in the studies of landslides, tectonic displacements, glaciers, and river flows (Manfreda et al., 2018). Tracking algorithms are widely utilized for monitoring purposes in terrestrial settings and in satellite remote sensing, but they need to be adapted for the application with UAV imagery because resolution, frequency and perspective are different. For instance, geometric and radiometric distortion need to be minimal
for successful feature tracking, which can be a large issue for UAV imagery in contrast to satellite imagery with much smaller image scales (Gruen, 2012).

Using UAV systems for multi-temporal data acquisition as well as capturing images with high frequencies during single flights enables lateral change-detection of moving objects. And if the topography is known, a full recovery of the 3D motion vector is possible. The underlying idea is the detection or definition of points or areas of interest, which are tracked through consecutive images or frames considering the similarity measures.

In this chapter, pre-processing steps for successful image tracking and vector scaling are introduced. Afterwards, two possible strategies of tracking, i.e. feature-based and patch-based, are explained. Furthermore, different choices of tracking in image sequences are discussed. And finally, examples are given from different fields.

3.3.1 Image pre-processing

UAV image sequences can be acquired either during multiple flight campaigns, to observe phenomena evolving at slow rates, e.g. landslide monitoring, or during a single campaign focusing on faster change rates, e.g. lava or river flows. In both cases, information about the terrain has to be considered to calculate scaled motion vectors (chapter 3.3.1.1). Thereafter, frame co-registration is necessary for precise tracking of objects. This step becomes more critical when image sequences of high frequencies are captured (chapter 3.3.1.2). Finally, image filtering may be required to increase the robustness of image tracking (chapter 3.3.1.3).

3.3.1.1 Image ortho-rectification

It is important to account for the impacts of camera perspective and relief to avoid false scaling of tracking vectors. The objective is the projection of the original image, which might be captured from oblique viewing angles looking at unlevelled terrain, into an image plane to calculate a distortion-free photo where the scale remains constant (Figure 3.3-1). Without this transformation, correct measurements would solely be possible if a planar terrain is captured from nadir view. To achieve the conversion from central projection, i.e. lines of projection intersect at one point (the projection centre), to parallel projection, i.e. lines of projection are orthogonal to the projection plane, knowledge about the interior camera geometry, the camera position and orientation during the moment of capture, and the topography is required. This information can be retrieved by capturing overlapping images and using SfM photogrammetry. The result is an orthophoto
allowing for distance and angle measurements. You can find more details regarding the process of calculating an orthophoto in chapter 2.2.

Figure 3.3-1: The captured scene can be distorted due to the influence of camera perspective and relief, hindering scaled measurements. An oblique view of planar terrain leads to increasing scale overestimation with increasing distance to the camera projection centre. Terrain deviating from a plane leads to increasing scale underestimation with decreasing projection centre to object distance. Information about the relief has to be incorporated for a correct transformation from central projection to parallel projection. All figures were prepared by the authors for this chapter.

3.3.1.2 Image co-registration

To track the displacement of fast-moving objects, such as particles on water, it becomes necessary to capture images in a fast sequence, for instance, using videos. In most circumstances, UAVs are not able to capture the entire event from a stable position and orientation, among others due to vehicle drifts and tilts caused by wind and due to vibrations of the sensor. If these movements are not mitigated, they will affect the calculation of correct flow velocity vectors. Therefore, image sequences need to be stabilized exploiting fixed targets, which can be identified in the image sequence.

Image stabilization can be achieved by manually identifying tie points or performing an automatic detection and matching of points of interest (chapters 3.3.2 and 2.2). The information of the corresponding points is used to retrieve the parameters of a transformation matrix between the two images. Usually, either an affine transformation with six parameters (two scales, two shifts, one rotation, and one shear) is considered (Figure 3.3-2b) or a homography with eight parameters is estimated, where lines between both images still remain straight lines after the transformation (Figure 3.3-2c). With the retrieved transformation matrix, the source image will
be converted, requiring the interpolation of a new image. In the end, the co-registered image sequence has to be ortho-rectified for correct scaling of tracks (chapter 3.3.1.1), applying the same transformation to all images.
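Assuming that corresponding tie points on stable ground have already been measured in a reference image and in the frame to be stabilized, a possible OpenCV-based sketch of this co-registration step could look as follows (function and argument names are hypothetical):

```python
import cv2
import numpy as np

def stabilize_frame(frame, ref_shape, pts_frame, pts_ref):
    """Warp `frame` into the geometry of the reference image.

    pts_frame, pts_ref: Nx2 float arrays of corresponding tie points on stable ground,
    measured manually or matched automatically beforehand (assumed inputs).
    """
    # Homography (8 parameters), estimated robustly with RANSAC;
    # cv2.estimateAffine2D would give the 6-parameter affine alternative.
    H, inliers = cv2.findHomography(pts_frame, pts_ref, cv2.RANSAC, ransacReprojThreshold=3.0)
    h, w = ref_shape[:2]
    # Warping interpolates a new, co-registered image.
    return cv2.warpPerspective(frame, H, (w, h))
```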

It has to be noted that the approach via tie points assumes that the surface is a plane, which can be a suitable approximation for higher flying heights and/or relatively flat terrain. Another requirement is that the UAV imagery captures stable areas distributed around the area of interest. This is not possible in all scenarios, for instance, if large areas are affected by movements. In such cases, other possibilities need to be considered. One option can be direct referencing (chapter 2.1). However, accuracy demands regarding position estimation with dGNSS, orientation reconstruction with the IMU, and camera synchronisation are very high, and future research has to reveal whether such an approach will be possible.

Figure 3.3-2: Distortion of the image due to off-nadir image acquisition and/or sloping terrain. (a) Undistorted image. (b) Distorted image describable with affine transformation.

(c) Distorted image describable with perspective transformation (homography).

3.3.1.3 Image filtering

Tracking objects in image sequences can be sensitive to noise and low signal strength, leading to ambiguities. Especially in environmental applications, difficulties due to lighting conditions (e.g. glare and shadows) or water turbidity (e.g. transparent, clear water) have to be mitigated. Therefore, different image processing approaches might be considered to increase the robustness of data analysis.

Applying a low-pass filter is a possible method to decrease image noise. One option for image smoothing is convolution: a kernel or window of a specific size is applied to the original image (Figure 3.3-6). Possible kernels are a Gaussian kernel (Figure 3.3-3b), where the weight of a pixel decreases with distance to the centre pixel, a median kernel, which is especially suitable for salt-and-pepper noise, or a bilateral kernel, where the noise is reduced but the edges are preserved. Further image improvements are possible via contrast enhancement (Dellenback et al., 2000), gamma correction (Tauro et al., 2017), histogram equalization (Dal Sasso et al., 2018) or an intensity threshold criterion (Jodeau et al., 2008).


Another option to increase the robustness of image sequence analysis is the calculation of image derivatives, for instance, considering edges by applying a Laplace operator (Figure 3.3-3c). To improve the signal strength, the histogram of the radiometric pixel values of an image can be modified. An example is the adaptive histogram equalization that amplifies the contrast in distinct image regions instead of applying a global histogram change (Pizer et al., 1987). Another approach to improve the signal for tracking is the calculation of derivatives from SfM (chapter 2.2) or Lidar (chapter 2.6) derived digital elevation models (chapter 3.4), e.g. considering hillshades to identify traceable features in the terrain.
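A possible OpenCV sketch of the filters mentioned above, applied to a greyscale image array, could look as follows; the random placeholder image and all kernel sizes and parameters are illustrative only.

```python
import cv2
import numpy as np

img = (np.random.rand(480, 640) * 255).astype(np.uint8)  # placeholder for a greyscale frame

smoothed  = cv2.GaussianBlur(img, (5, 5), 0)              # Gaussian low-pass filter
denoised  = cv2.medianBlur(img, 5)                        # median filter (salt-and-pepper noise)
preserved = cv2.bilateralFilter(img, 9, 75, 75)           # edge-preserving bilateral filter
edges     = cv2.Laplacian(img, cv2.CV_16S, ksize=3)       # Laplace operator (image derivative)

clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(img)                               # adaptive (local) histogram equalization
```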

Figure 3.3-3: Different options of image filtering to reduce the impact of image noise or to increase the tracking robustness. (a) Original image. (b) Gaussian filtered image for smoothing.

(c) Laplace filtered image to keep edges only for tracking.

3.3.2 Feature-based tracking

Feature-based tracking in image sequences can be separated into three processing steps: feature detection, feature description, and feature matching. These steps are similar to the image matching approach during SfM, which was introduced in chapter 2.2. The result of feature-based matching is in most scenarios a sparse set of correspondences. To find distinct and traceable image points, assumptions about the required feature shape are made. The feature has to reveal a large contrast to its neighbourhood, and the strong intensity changes have to occur in at least two directions. First- or second-order derivatives of the image can be calculated to assess the radiometric gradients and their orientation. In flat areas, no changes in any direction are measurable. Along edges, intensity changes occur solely in one direction, resulting in ambiguous feature matches. Thus, blobs or corners are the interest operators of choice (Figure 3.3-4). As blob features were already introduced in detail in chapter 2.2, the focus here lies on corner features.


Figure 3.3-4: Examples of unsuitable features as well as corners and blobs as suitable features for tracking. (a) Unfiltered, raw image. (b) Radiometric gradient filtered image.

An example of a corner feature detector is the Harris detector (Harris & Stephens, 1988). Image gradients are calculated via convolution using the Sobel operator. Thus, first derivatives are estimated for both image directions. Within local neighbourhoods, the distribution of the retrieved gradient intensities is assessed, and the corresponding eigenvalues are calculated, making the feature detector rotation invariant. Finally, a score is computed from the eigenvalues. Both eigenvalues are high for corners. If only one eigenvalue is high, or both eigenvalues are low, an edge or a flat area has been detected, respectively. Another corner feature is the Shi-Tomasi feature (Shi & Tomasi, 1994), which is especially designed for tracking tasks. The approach is similar to the Harris detector; however, the score function is different, as both eigenvalues solely have to be above a minimum threshold.
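As a sketch, Shi-Tomasi corners can, for instance, be detected with OpenCV as follows; the input file name and all parameter values are hypothetical.

```python
import cv2

img = cv2.imread("frame_0001.png", cv2.IMREAD_GRAYSCALE)  # hypothetical greyscale frame

corners = cv2.goodFeaturesToTrack(
    img,
    maxCorners=500,      # upper limit on the number of returned corners
    qualityLevel=0.01,   # minimum accepted corner score relative to the strongest corner
    minDistance=10,      # minimum pixel distance between detected corners
)
# Setting useHarrisDetector=True would switch from the Shi-Tomasi to the Harris score function.
```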

Another possibility to extract features is the binarization of the images: a threshold value is identified which allows separating the background from the particles, represented by brighter colours. Thus, pixels with a higher intensity than the threshold keep their value unaltered, and pixels with lower intensities are assigned a black colour (Figure 3.3-5). The procedure described above is called global thresholding, but other methods also exist in the literature, such as: i) local thresholding, which overcomes the limits of the global approach by varying the value of the threshold within the image depending on the light intensity, or ii) Otsu's method (Otsu, 1979), which performs clustering-based image thresholding.
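A minimal OpenCV sketch of global and Otsu thresholding, again assuming a greyscale image and an illustrative fixed threshold value, is:

```python
import cv2

img = cv2.imread("frame_0001.png", cv2.IMREAD_GRAYSCALE)  # hypothetical greyscale frame

_, binary_global = cv2.threshold(img, 180, 255, cv2.THRESH_BINARY)                  # fixed global threshold
_, binary_otsu   = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # Otsu's method
```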

Figure 3.3-5: Binarization of radiometric information to apply a threshold (histogram) to keep points of interest, in this case, floating particles at the water surface.


The extracted features can either be used to estimate descriptors considering their local neighbourhood and subsequently matching these features, or the features can be considered as points of interest for a subsequent patch-based matching approach.

3.3.3 Patch-based tracking

Patch-based tracking approaches define areas or patches, which are then tracked by searching for the corresponding location of the highest similarity in the next image. The areas to track can be chosen manually, defining regular grids, or considering the locations of detected features (chapter 3.3.2) to create templates. Dense sets of correspondences are possible, e.g. in the case of the definition of grids with high resolution. In patch-based tracking techniques, correspondences are found at locations where matching costs are minimal. Tracking can either be performed in the spatial or the frequency domain.

3.3.3.1 Tracking in the spatial domain

The most common approaches in the spatial domain are similarity and optimization algorithms. In the case of similarity estimates, kernels of finite size, with radiometric information extracted from the source image, are searched for in the target image. Thus, the kernel is moved across the search image to find the position where the kernel information and the overlapping local target information are most similar (Figure 3.3-6). Different kernel functions can be applied in the convolution, e.g. considering the sum of squared differences (SSD). Another frequently used template matching function is the normalized cross-correlation (NCC), which accounts for brightness and contrast changes to increase the matching robustness. The results of the kernel applications are similarity maps, where the similarity peak (a minimum for SSD and a maximum for NCC, respectively) corresponds to the final position of the tracked feature.
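As an illustration, a template patch cut from the source frame can be located in the next frame with OpenCV's template matching; the file names and patch coordinates below are hypothetical, and TM_CCOEFF_NORMED is used here as a normalized cross-correlation type score.

```python
import cv2

next_frame = cv2.imread("frame_0002.png", cv2.IMREAD_GRAYSCALE)                     # hypothetical next frame
template   = cv2.imread("frame_0001.png", cv2.IMREAD_GRAYSCALE)[100:132, 200:232]   # patch from the source frame

similarity = cv2.matchTemplate(next_frame, template, cv2.TM_CCOEFF_NORMED)  # NCC-type similarity map
_, max_val, _, max_loc = cv2.minMaxLoc(similarity)   # peak of the map = best matching position
# With cv2.TM_SQDIFF (SSD) the best match is the location of the minimum instead.
```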


Figure 3.3-6: Patch-based tracking approaches. Kernel k with information from image x-1 (source image) slides across search image x (target image). At each pixel position x_{i,j}, the extracted patch of the search image, corresponding to the overlapping area of the kernel, is compared with the kernel applying different functions. Different similarity measures R can be considered, e.g.

SSD (sum of squared differences) or NCC (normalized cross-correlation). Image displays a cross- correlation map, where NCC values were computed using a moving window over the search area. Diagram illustrates a 1D representation of sub-pixel interpolation by estimating the extreme value for a Gaussian fitted curve to NCC values along the x-axis of similarity image.

SSD and NCC have the disadvantage that both measures are sensitive to rotation, scale changes and shear. However, other patch-based matching methods such as optimization algorithms can overcome these constraints. An example is least-squares matching (LSM; Ackermann, 1984; Förstner, 1982). LSM searches for the transformation matrix between two image patches such that the sum of squared grey value differences is minimized. For instance, if it is assumed that the corresponding patches are located in a plane, six parameters of an affine transformation are estimated (Figure 3.3-2b). This enables the tracking of distorted features, e.g. at stretching landslides, buckling glaciers, or rotating particles on rivers. The optical flow algorithm of Lucas-Kanade (Lucas & Kanade, 1981), increasingly used in hydrological tracking tasks, is another optimization approach fitting an affine model to the motion field. Sub-pixel accurate measurements are possible, and the statistical output of the adjustment can be used to assess the matching quality. Due to the non-linearity of the adjustment, approximation values are required, which can be provided by assuming only minimal changes between images (e.g. in the case of high-speed imagery or very slow-moving objects), using the results of other matching approaches (e.g. NCC) as first estimates, or considering hierarchical approaches (chapter 3.3.3.3).
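A possible sketch of pyramidal Lucas-Kanade tracking with OpenCV is given below; the file names and parameter values are hypothetical, and the corner points are assumed to come from a detector such as the one in chapter 3.3.2.

```python
import cv2

prev_frame = cv2.imread("frame_0001.png", cv2.IMREAD_GRAYSCALE)   # hypothetical consecutive frames
next_frame = cv2.imread("frame_0002.png", cv2.IMREAD_GRAYSCALE)
prev_pts = cv2.goodFeaturesToTrack(prev_frame, maxCorners=500, qualityLevel=0.01, minDistance=10)

next_pts, status, err = cv2.calcOpticalFlowPyrLK(
    prev_frame, next_frame, prev_pts, None,
    winSize=(21, 21),   # patch (kernel) size of the local least-squares adjustment
    maxLevel=3,         # number of image pyramid levels (see chapter 3.3.3.3)
)
tracked = next_pts[status.ravel() == 1]   # keep only successfully tracked points
```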


3.3.3.2 Tracking in the frequency domain

To find the position of highest similarity, it is also possible to estimate displacements in the frequency domain using the Fourier transformation. The phase correlation approach (e.g. De Castro & Morandi, 1987) calculates the cross-correlation between the Fourier transformed search and kernel patches to retrieve the phase shift in the frequency domain and thus the lateral shift between both image patches in the spatial domain (Figure 3.3-7). Finding matches in the frequency domain is significantly faster than measuring in the spatial domain.
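OpenCV also offers a direct implementation of this idea; a minimal sketch, assuming two co-located float patches from co-registered frames (synthetic placeholders here), is:

```python
import cv2
import numpy as np

# Two co-located patches cut from consecutive, co-registered frames (placeholders here).
patch_source = np.float32(np.random.rand(64, 64))
patch_target = np.roll(patch_source, shift=3, axis=1)   # the same patch shifted by 3 pixels

(shift_x, shift_y), response = cv2.phaseCorrelate(patch_source, patch_target)
print(shift_x, shift_y)   # recovered lateral shift between the patches
```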

Figure 3.3-7: Simplified 1D representation of measuring the phase shift θ between the search (target) and source object in the frequency domain to retrieve the displacement.

3.3.3.3 Improving robustness and accuracy

In most cases of patch-based tracking, the feature to track will not be located at the pixel centre in the search image due to signal discretization, i.e. the conversion of a continuous signal to a discrete (integer) value during the image capture process. Thus, to improve the matching accuracy, sub-pixel estimation can be necessary. One approach is the fitting of a paraboloid (Figure 3.3-6) at the position of the highest score in the similarity map and then extracting the coordinates at the local extreme value. The advantage of this method is that the strength of the match can also be evaluated considering the steepness of the paraboloid. Further parameters for quality assessment of the similarity measure are the height and uniqueness of the estimated values.

Patch-based matching approaches can be further improved regarding their robustness and accuracy with hierarchical methods, which build image pyramids made of increasingly downsampled images to incrementally decrease the image resolution (Figure 3.3-8). The tracking starts at the highest pyramid level, thus at the image with the lowest resolution. The search area can cover nearly the entire image. The position of the matching result is used as an approximation to confine the search area in the next pyramid level. These steps are repeated until the last level with the full image resolution, where the final location of the match is extracted. The hierarchical approach helps to mitigate the impact of choosing the right kernel and search window sizes. The larger the kernel is chosen, the less sensitive it is to ambiguities due to repeating patterns, and
the smaller it is chosen, the higher the accuracy will be because more details are captured. Likewise, the larger the search window is chosen, the larger the displacements that can be captured, and the smaller it is chosen, the faster the processing times that are achieved. Therefore, applying image pyramids allows for processing from a stage of high robustness at the first low-resolution levels to a stage of high accuracy at the last high-resolution levels.

Figure 3.3-8: Applying image pyramids to improve the tracking robustness and accuracy. The highest level corresponds to the image of the lowest resolution (first image), and the base level corresponds to the image of the highest resolution (last image). The matching result at each level serves as an approximation for the next level. The kernel has the same number of pixels in each level, and therefore different areas of the scenery are covered. Note that kernel size and downsampling are not scaled accordingly in this example to enhance the visibility of changes at different levels.

A further option to increase the accuracy of the tracking is the application of filtering algorithms to the final tracks. These can either be used globally, considering, e.g., the average and standard deviation of all measured displacements to identify outliers, or locally, considering, e.g., displacement statistics only within a specified neighbourhood. The latter approach is especially useful for objects with complex movement patterns.

3.3.4 Tracking strategies

Different spatial tracking strategies are possible for the successful estimation of velocities and directions of moving objects in UAV image sequences. First of all, it has to be considered whether tracking is performed in stationary image sectors, where tracking in each subsequent image starts again at the same image coordinate, i.e. the Euler approach, or whether the track of a specific target is followed through the image sequence, i.e. the Lagrangian approach. The Euler method is generally computationally more efficient with respect to the Lagrangian method. In return, the latter approach is able to perform measurements also with low tracer density, whereas the former relies on abundant
seeding density. To identify matching regions or features, the concept of similarity between groups of particles in two consecutive images is used, but it is also possible to use multi-frame algorithms that use three or more consecutive frames to solve the problem of correspondences.

Once the particle positions are identified, the velocity is estimated by dividing the displacement of particles between consecutive frames by the time interval between the pair of images. A finite difference scheme is thus applied implicitly for calculating the velocity. Therefore, the temporal accuracy is directly correlated to the image frequency. The sampling frequency must be chosen properly in order to avoid over- or undersampling, which may lead to missed features or high velocity uncertainties if displacements occur in the sub-pixel range, respectively. Different temporal tracking strategies are possible with different temporal bases, overlaps and resolutions (Schwalbe, 2013, Figure 3.3-9). For instance, in a scenario of very slow-moving particles captured with high framerates, instead of tracking consecutive frames as illustrated by strategy two in Figure 3.3-9, it might be suitable to skip frames and track features subsampling frames at a lower frequency. This may help to enhance the visibility of shifts and movements of objects within each frame. Thereby, features or patches might be detected, e.g., every frame or every second frame (strategies four and three in Figure 3.3-9, respectively).
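As a simple worked example of this scaling, assuming ortho-rectified frames with a known ground sampling distance and frame rate (all numbers below are hypothetical):

```python
# Converting a tracked pixel displacement into a metric velocity (illustrative values only).
gsd_m_per_px = 0.02        # e.g. 2 cm ground sampling distance of the ortho-rectified frames
frame_rate_hz = 25.0       # frames per second of the video
frames_skipped = 2         # temporal base: features tracked every second frame

displacement_px = 7.5      # measured displacement between the two matched frames
velocity_m_per_s = displacement_px * gsd_m_per_px * frame_rate_hz / frames_skipped
print(velocity_m_per_s)    # -> 1.875 m/s
```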

Figure 3.3-9: Temporal matching strategies (after Schwalbe, 2013).

To transform the measurements within the image sequences into displacements in a scaled coordinate system, and correspondingly into metric velocity values, it is necessary to reference the tracking result (chapter 3.3.1.1). Referencing can either be performed prior to the tracking processing or afterwards. Executing the tracking in the original image, and thus transforming only the coordinates of the tracked particles afterwards, entails the advantage that interpolation errors, especially in strongly tilted images, are avoided.


3.3.5 UAV monitoring applications

The applications of tracking approaches to UAV data are vast and therefore entail very case-specific challenges. Here, we present three common fields of application – hydrology (chapter 4.3), geomorphology (chapter 4.2) and glaciology (chapter 4.5) – to highlight different advantages, challenges and limits of image sequence analysis of UAV-based data.

3.3.5.1 Streamflow

Image-based flow velocity measurement with UAV imagery is a valuable emerging flow gauging technique, which can also be applied to terrestrial images captured by fixed or mobile stations (Eltner et al., 2020). The advantage of using UAVs is the possibility of greater coverage of the river surface at multiple locations, including potentially inaccessible sites. Furthermore, they tend to fail less at high flow conditions compared to classical monitoring systems.

A vast number of methodological approaches are available to compute water surface velocities. The most frequently adopted algorithms are large-scale particle image velocimetry (LSPIV, Le Coz et al., 2010), belonging to the Euler tracking strategy, and particle tracking velocimetry (PTV, Tauro & Grimaldi, 2017), belonging to the Lagrangian tracking strategy. LSPIV is an adaptation of particle image velocimetry (PIV, Creutin et al., 2003). In contrast to PIV, LSPIV can be used for a wider range of physical phenomena due to its capacity to cover larger areas and to adopt low-cost cameras. Regardless of the specific algorithm considered for tracking, the estimated velocity is recovered from the information of tracing features on the water surface, i.e. natural foam, seeds, woody debris, and turbulence-driven patterns.

Accuracy assessments of UAV image velocimetry revealed that stationary UAV measurements are in strong agreement with established flow gauging approaches. To better understand the complexity of 2D river flow structures, the following major points have to be respected: i) the stability of the camera, ii) a good compromise between flight altitude, camera resolution, tracer particle size and river width (Lewis & Rhoads, 2018), iii) the potential necessity of non-oblique UAV imagery at wider rivers to enable the coverage of the entire cross-section, and iv) the presence of a traceable pattern on the water surface. Seeding density is one of the most relevant parameters in the determination of reliable velocity fields. When facing low seeding density conditions, the number of analysed frames should be increased for more accurate results (Dal Sasso et al., 2018).


3.3.5.2 Landslide

UAVs offer a cost-effective, time-efficient, flexible and safe data collection solution to improve the spatio-temporal resolution of landslide movement maps (chapter 4.2), e.g. through the comparison of SfM-derived co-registered digital surface models (DSMs) or using multi-temporal orthophotos. Landslide tracking techniques applied to satellite, airborne or terrestrial data cannot be easily transferred to UAV imagery due to the different monitoring scales. Therefore, Lucieer et al. (2014) applied the COSI-Corr (co-registration of optically sensed images and correlation) algorithm (Ayoub et al., 2009) to hillshaded DSMs, instead of RGB imagery, to measure landslide movements. In a further step, other UAV-derived morphological attributes, such as slope, openness and curvature, can be considered (Peppa et al., 2017). Furthermore, feature tracking approaches based on terrain break-lines can be more suitable to detect landslide movements with substantial surface deformation, whereas NCC-based correlation can be more appropriate when targeting small landscape elements.

The presence of vegetation can become an important challenge. For instance, image cross-correlation performance decreases when the terrain surface is covered with grass. And vegetation's negative effect on correlation is even more pronounced when images were acquired in different seasons (e.g. spring and winter). Although some errors are expected, especially over regions with rotational failures, UAV-based methods offer a reliable quantification of translational earth-flow activity, in particular the movement of ground material pieces, vegetation patches and landslide toes (Lucieer et al., 2014; Peppa et al., 2017).

3.3.5.3 Glacier

Similarly to landslide monitoring, UAV-acquired data can be beneficial to better understand glacial dynamics (chapter 4.5). However, applying UAV image-based processing can be particularly challenging in these landscapes due to large uniform surfaces, whose texture can, however, be enhanced by the presence of dust or debris. One of the challenges when quantifying glacier velocity is isolating ice movement from other surface displacements (e.g. debris slope collapse or falling blocks from the moraine onto the ice surface) (Rossini et al., 2018). Application of a multi-scale mode, implemented in COSI-Corr, allowed for the exclusion of the majority of this noise. The best results involved a trade-off between limited noise, when using larger correlation windows, and fine-scale details. Besides orthomosaics, hillshaded DSMs and DSM derivatives, e.g. detected edges, can also provide a globally coherent output. Feature-tracking algorithms used to compute glacier surface velocity can perform similarly to manual digitalization, and they enable fine spatio-temporal displacement quantification of debris-covered glaciers (Rossini et al., 2018).


References for further reading


3.4 Digital Elevation Models and their topographic derivatives

Giulia Sofia

3.4.1 Acronyms
3.4.2 DEM generation
3.4.3 Data processing and construction
3.4.4 DEM accuracy
3.4.5 Guidance
3.4.6 DEM derivatives
3.4.7 Redundancy and scale
3.4.8 DEM of difference
3.4.9 Quantifying spatially variable uncertainty
3.4.10 Final remarks

The current development with UAVs is revolutionizing many fields in geosciences, at least for small- to medium-scale studies. In comparison with traditional topographic surveys and modern techniques such as laser scanning and aerial photogrammetry, UAV applications are generally cheaper, provide faster data acquisition and processing, and generate several high-quality products with an impressive level of detail.

UAV applications often rely on Digital Elevation Models (DEMs) to represent the topography, and on digital terrain modelling or geomorphometry (see Sofia (2020) for a recent
review), a steadily increasing range of techniques providing a fully objective description of landforms through descriptive measures of the surface form (Evans, 2012), in their purest form as elevation, slope, and aspect, and with increasingly sophisticated measures (Wilson, 2018; Hutchinson & Gallant, 2000a,b). These tools offer the best opportunity for understanding the physical context of the Earth's surface at spatial and temporal frequencies that are commensurate with rates of natural processes (Viles, 2016; Passalacqua et al., 2015; Tarolli, 2014).

The increase in the quality of UAV survey georeferencing, achieved mainly through the use of ground control points and real-time kinematic technology, has led to the reproducibility and repeatability of multi-temporal spatial data (Clapuyt et al., 2017). As a consequence, UAV-based multi-temporal digital surface models and orthophotos also provide the opportunity to extend the timescales of enquiry and, based on knowledge of forcing events during the monitoring period, inferences can be made about the evolution of processes. Not only digital cameras but also more advanced geophysical sensors, including LiDAR (Lin et al., 2019), multispectral cameras (Diaz-Varela et al., 2014) or meteorological sensors (Spiess et al., 2007), can be mounted on board UAVs. Ground penetrating radar (Chandra & Tanzi, 2015) or drone-mounted magnetometers (Versteeg et al., 2007) allow for underground surveys, for instance. Bathymetric LiDAR provides the technology for underwater surveys (Mandlburger et al., 2016).

Despite the quality of the software and data currently available, there is an uncertainty intrinsic to the surfaces acquired by UAVs, and this discrepancy needs to be assessed in order to validate the techniques applied. This points to a series of unique challenges regarding DEM pre- and post-processing, the uncertainties and their subsequent application, and the consistent representation of processes in the digital realm.

The importance of resolution has been deeply investigated and highlighted in geomorphology in general (Passalacqua et al., 2015; Tarolli, 2014), but the way we conceptualize the surface is also becoming more and more critical (Sofia, 2020). As survey techniques advance, problems arise because of insufficient resolution as compared to the landscape of interest (i.e. loss of locally significant features such as ridgelines and streams), and because of the scale-dependency of many descriptors (Bishop et al., 2012).

As well, we need improved algorithms to filter out vegetation, buildings and other human-made structures in the DEMs that we can generate from remote sensing and UAV surveys. Finally, the default surface surveyed with drones is generally the top of structures or vegetation, and most geoscientific applications require a bare-earth DEM; therefore, new challenges exist for the creation of DEMs from photogrammetry-based surveys.


Figure 3.4-1: The main tasks associated with digital terrain modelling and the sources of errors. Modified and updated from (Wilson, 2012; Hutchinson & Gallant, 2000). All figures were prepared by the author for this chapter.

Challenges also emerge from the steadily growing number of parameters and algorithms for processing DEMs and defining descriptive measures and surface features. Because terrain analysis is currently implemented in many commercial or open-source software packages, procedures are implemented through different methods and algorithms. The results of different workflows often conflict, leading to uncertainties due to the mathematical model by which land parameters are calculated and the size of the search window, and each step's bias and limitations are generally transferred and accumulated to the next step (see Figure 3.4-1 for a typical workflow and sources of errors). The challenge is, therefore, to recognize and minimize uncertainties in data that are particularly elusive.

Aside from the challenge of deriving descriptive statistics, quantifying volumetric change using UAV-based data is also a process prone to bias. The ability to develop spatially distributed models of topographic change generally relies on a DEM of Difference (DoDs – Wheaton et al., 2010), and requires the reconstruction of one or more geomorphic surfaces from which elevation
changes can be computed. The quality and confidence in the topographic data available are usually the limiting factors in the accuracy and confidence in the resulting analysis.

This book chapter provides an overview of the state-of-the-art for a typical digital terrain modelling workflow, from DEM production, to surface modelling, and identification of morphological changes or DEM errors. This workflow is cross-disciplinary and independent from the sensor used for the survey. Nonetheless, UAV surveys specifically require additional care and processing, which will be addressed throughout the chapter.

The remainder of the chapter is organised as follows. The next section describes the primary sources and methods for capturing elevation data, and it presents the methods used to pre-process DEMs along with some of the challenges that confront those who tackle these tasks. This section also describes the various kinds of errors that are embedded in DEMs and how these may be propagated and carried forward with the calculation of different land surface parameters. Chapter 3.4.6 describes the land surface parameters that are derived directly from DEMs to model water flow and related surface processes. Chapter 3.4.8 discusses the above-mentioned DoD. The final section offers some concluding remarks.

3.4.1 Acronyms

During the years, the concepts of Digital Elevation Model (DEM), Digital Terrain Model (DTM) and Digital Surface Model (DSM) have been used with a context-dependent implication. The use of different terms mostly relates to the technological development of surveying techniques (Table 3.4-1). The earliest definition of a Digital Terrain Model (DTM) dates back to the 50s and refers to ‘a statistical representation of the continuous surface of the ground by a large number of selected points with known xyz coordinates in an arbitrary coordinate field’ (Miller & Laflamme, 1958). As of today, in most cases the term digital surface model represents the Earth’s surface and includes all objects on it.

In contrast to a DSM, the digital terrain model (DTM) represents the bare ground surface without any objects like plants and buildings, but may include other artificial features, such as road embankments (Li et al., 2004; Maune, 2001). DEM is often used as a generic term for DSMs and DTMs, only representing height information without any further definition about the surface. With the growing application of LiDAR, it is recommended to employ DTM for explicitly describing the bare-earth surface generated from LiDAR raw point clouds, while DEM is generally recommended in studies based on photogrammetry (DEMs from structure-from-motion or satellite). Throughout this book chapter, the term DEM will be considered as any generic numeric representation of a topographic surface arranged as a set of regularly spaced points in a square grid.


Table 3.4-1: Various historical and current definitions of DEM, DSM and DTM.

3.4.2 DEM generation

The data sources and processing methods for generating DEMs have evolved rapidly over the past 20–30 years, from ground surveying and topographic map conversion to remote sensing with LiDAR, RADAR and UAVs (among others). Nelson et al. (2009) and Wilson (2012) provide an excellent overview of DEM production and generating sources, and they define three possible sources for DEM data: (1) ground survey techniques, (2) digitalization of existing topographic maps, and (3) remote sensing (airborne and satellite, laser systems, interferometry, and unmanned systems – airborne, terrestrial, underwater – as well as Time of Flight (ToF), hand-held or supported cameras). A succinct summary of the significant features of each of these options and their typical application scale is shown in Table 3.4-2.


Table 3.4-2: Significant features of survey options and their application scale [inspired and modified from (Nelson et al., 2009; Wilson, 2012)] [v = vertical, h = horizontal accuracy].

LiDAR surveys from aerial or terrestrial laser scanners are generally the preferred support to represent fine-scale (in space and time) elements and obtain high-quality, high-resolution data. However, despite the high vertical and horizontal accuracy, LiDAR surveys often do not provide
the areal coverage or temporal conditions required for particular studies. Finer space-time resolution topographic data can be derived from UAV surveys and techniques based on Structure from Motion (SfM) and Multi-View Stereo (MVS). However, logistical constraints related to repeat surveys in the field, or the extent of coverage, still exist (Eltner et al., 2016; Pearson et al., 2017; Smith & Vericat, 2015; Carrivick et al., 2016; Smith et al., 2015). Importantly, higher-resolution data require greater storage and computing capacity, and this restricts their existence to populated areas in wealthier nations, or to limited locations where researchers conduct their studies.

A tremendous advance in surveying techniques has been given by the use of spaceborne platforms for DEM generation. Notwithstanding the issues related to cloud coverage, DEMs can nowadays be quickly produced over large and inaccessible areas in (near) real-time or within a relatively short time, at a remarkably cheaper cost (Saeed et al., 2020; Purinton & Bookhagen, 2017). The ALOS World 3D – 30 m (AW3D30), the ASTER Global DEM Version 2 (GDEM2), the SRTM-30 m, the TANDEM-X DEM (90 m), and the MERIT DEM (90 m) have become available to the general public free of charge. A disadvantage of these DEMs, however, is that their resolution is insufficient for most applications except where relief is high, and the fact that many of the DEMs currently available are over a decade old. With the development of 1 m optical and stereo imagery acquired from satellite-borne sensors, precision in the elevations of derived DEMs at the meter scale is currently possible (Lane & Chandler, 2003). Examples of improved-resolution global DEMs are offered by the High Mountain Asia (HMA) DEM (8 m) by NASA, the Arctic DEM (~0.5 m resolution) (also provided free of charge), the newly produced TANDEM-X (~12 m resolution in the North-South direction), or the Pleiades-derived DEMs. Further improvement can be expected with the use of spaceborne LiDAR data (i.e. ICESat-2 data (Neuenschwander et al., 2019)), for example, for processing improvements, elevation control, void-filling and merging with data unavailable at the time of other spaceborne DEM productions.

3.4.3 Data processing and construction

DEMs can be interpolated from irregularly spaced three-dimensional points collected from various sources (Table 3.4-2). However, with differences related to the remote sensing sensor, it is generally necessary to preprocess the data before DEM interpolation, to reduce systematic and random errors and to improve the quality of the DEMs.

It is widely accepted that the UAV-derived DEM accuracy from SfM-MVS, i.e., aerial or terrestrial photogrammetry processing, is influenced by flight design and planning factors, such as GSD (ground sample distance), inclusion (or not) of oblique images, sensor and camera lens, flight pattern and georeferencing method, etc. (Manfreda et al., 2019). As well, flight altitude
influences the DEM quality, where lower flights produce better DEMs; similarly, overcast weather conditions are preferable, but weather conditions and other factors influence DEM quality as well. Many works (Harwin & Lucieer, 2012; Hudzietz & Saripalli, 2011; James et al., 2017b) have analysed the effects of each of them. Standard DEM generation algorithms also suffer from typical errors introduced by the use of an onboard Global Positioning System (GPS) receiver, antenna and inertial measurement unit (IMU), resulting in prevalent systematic errors or "drifts" in the GPS camera positions.

Further pre-processing to remove non-ground points is also needed, especially to achieve accurate UAV-based DEMs for geomorphological applications. For this, many ground filtering algorithms exist, but the lack of standard data and unified evaluation systems limits objective comparisons of different methods (Uysal et al., 2015; Ozcan & Akay, 2018; Chiabrando et al., 2017). Filtering methods are generally classified into four different categories (Sithole & Vosselman, 2004): slope-based (Vosselman, 2000); surface-based (Wan & Zhang, 2006); clustering/segmentation (Sithole & Vosselman, 2005); and block-minimum algorithms (Sithole, 2005). Discussions of standard algorithm performances can be found in Uysal et al. (2015); Ozcan and Akay (2018); Chiabrando et al. (2017); Sithole and Vosselman (2004); Sithole and Vosselman (2005); Wan and Zhang (2006); and Sithole (2005). While these works refer mostly to LiDAR filtering, similar approaches can be used for UAV-based surveys, and further filtering approaches can be found in Pijl et al. (2020); Yilmaz and Gungor (2018); Yilmaz et al. (2018); and Zeybek and Şanlıoğlu (2019).

According to the literature, surface-based algorithms are the most commonly used, and they generally perform better (at least with high-resolution data). This is maybe because they use more context compared with other filtering algorithms (Sithole & Vosselman, 2004). The basic concept of surface-based filtering algorithms is to create a parametric surface that can approximate the actual ground surface. According to previous studies (Zhang & Lin, 2012), surface-based filtering algorithms are generally divided into three subcategories: morphology-based, iterative-interpolation-based, and progressive-densification-based filters. Morphology-based filters conduct a series of morphological operations such as opening and closing on rasterised point clouds. Alternatively, the progressive-densification-based filters utilise an initial triangular irregular network (TIN) to represent the ground surface. The initial TIN surface is then progressively densified under strong constraints. The iterative-interpolation-based filters iteratively approximate the true ground surface using various interpolation algorithms.

Once the data have been filtered, interpolation is needed to estimate elevation values throughout the landscape at the target resolution. Theoretically, the resolution of such a surface (or its scale) and the density of measurements required to obtain a specified accuracy depend on the variability of each terrain. The point density must be high enough to capture the smallest terrain features, yet not so fine as to over-sample the surface, which would introduce unnecessary data redundancy (Petrie & Kennie, 1987). The resolution of the obtainable DEMs also depends on the density of the input points: a pixel size of two to three times the average point spacing should be preferred (see e.g. Anderson et al., 2006 for LiDAR).

The literature on UAVs generally considers standard DEM interpolation approaches, as these are available in numerous commercial and non-commercial software packages. The following paragraphs highlight cross-platform interpolation methods used in the literature (Li et al., 2004). Interpolation methods generally fall into two groups: local and global. Local methods operate around the position of the predicted point (its neighbourhood), within an extent smaller than that of the study area. Examples of local deterministic methods are Inverse Distance Weighting (IDW), local polynomials, and Radial Basis Functions (RBFs). Global interpolation methods (e.g. kriging), on the other hand, use all available sample points to generate predictions for the whole area of interest. These methods can be used to evaluate and remove global variations caused by physical trends in the data.

Among the most commonly used local methods, IDW calculates the value as a distance-weighted average of sampled points in a defined neighbourhood (Manson et al., 1999). It assumes that points closer to the query location have more influence, and weights the sample points by the inverse of their distance from the required point.
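The weighting scheme is simple enough to sketch directly. The following Python example, a minimal sketch assuming scattered points in an array `xy` with elevations `z`, computes an IDW estimate from the k nearest samples around each grid node; the power parameter and neighbourhood size are illustrative choices.

```python
import numpy as np
from scipy.spatial import cKDTree

def idw_interpolate(xy, z, grid_xy, k=12, power=2.0):
    """Inverse Distance Weighting: distance-weighted average of the k nearest
    sampled points around each query location."""
    tree = cKDTree(xy)
    dist, idx = tree.query(grid_xy, k=k)
    dist = np.maximum(dist, 1e-12)             # avoid division by zero
    weights = 1.0 / dist**power
    return np.sum(weights * z[idx], axis=1) / np.sum(weights, axis=1)

# Hypothetical usage: scattered SfM points interpolated onto a 1 m grid
rng = np.random.default_rng(1)
xy = rng.uniform(0, 100, size=(5000, 2))
z = np.sin(xy[:, 0] / 10.0) + 0.05 * rng.standard_normal(5000)
gx, gy = np.meshgrid(np.arange(0, 100, 1.0), np.arange(0, 100, 1.0))
grid_xy = np.column_stack([gx.ravel(), gy.ravel()])
dem = idw_interpolate(xy, z, grid_xy).reshape(gx.shape)
```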

Natural Neighbor interpolation finds the closest subset of input samples to a query point and applies weights to them based on proportionate areas (Sibson, 1981). It is a local deterministic method, and interpolated heights are guaranteed to lie within the range of the samples used. It does not produce peaks, pits, ridges or valleys that are not already present in the input samples, and it adapts locally to the structure of the input data. It requires no user input, works equally well for regularly and irregularly distributed data, and produces reliable surfaces for morphological analysis (Pirotti & Tarolli, 2010; Boissonnat & Cazals, 2001).

The spline interpolation approach uses a mathematical function to minimise the surface curvature, producing a smooth surface that exactly fits the input points. Advantages of splining functions are that they can generate sufficiently accurate surfaces from only a few sampled points and that they retain small features. A disadvantage is that the interpolated surface may have minimum and maximum values outside the range of the data set, and that the functions are sensitive to outliers because the original data values are honoured at the sample points.

The ANUDEM method (Hutchinson, 2011) uses an interpolation technique specifically designed to create a surface that more closely represents a natural drainage surface and preserves both ridgelines and stream networks.

Kriging (Wu, 2017) is a geostatistical interpolation method that utilises variograms, which depend on the spatial distribution of the data rather than on the actual values. Kriging weights are derived using a data-driven weighting function to reduce the bias towards input values, and it provides the best interpolation when good variogram models are available.


3.4.4 DEM accuracy

The accuracy of a DEM is a function of several variables, such as the roughness of the terrain surface, the interpolation function and method, and the attributes (accuracy, density, and distribution) of the source data. The latter is in turn influenced by both systematic errors (e.g. the accuracy of the survey equipment and of the chosen method) and random errors (e.g. tilting of the pole when surveying with a dGPS), which are uneven across a surface, with generally low error across uniform surfaces and increased error associated with breaks of slope.

Overall, DEM errors can refer to:

1. Data errors due to the age of the data, incomplete density of observations, or the results of spatial sampling.

2. Measurement errors such as positional accuracy, data entry faults, or observer bias.

3. Processing errors such as numerical errors in the computer, errors due to interpolation or classification and generalisation problems.

DEMs obtained from UAV aerial images can provide relatively high accuracy (Anders et al., 2013; Rusli et al., 2019; Mancini et al., 2013; Chandler et al., 2018; Colomina & Molina, 2014). Factors that influence the final error associated with UAV-derived DEMs include (Uysal et al., 2015; Ruiz et al., 2013): camera-to-ground distance, camera-sensor system parameters, image network geometry, matching performance, terrain type, lighting conditions, and referencing methods.

DEM errors are not easily detectable and can introduce significant bias. Error assessment is often carried out with limited control data, and it generally only accounts for absolute horizontal and/or vertical accuracy. This measure of accuracy, however, presents two significant limitations. First, it does not represent the accuracy of higher-order DEM derivatives (e.g. slope and curvature), geomorphic metrics, or landscape features of interest to geoscientists. The problem is especially acute because relatively small elevation errors propagate into the first (slope) and second (curvature) derivatives, potentially obscuring geomorphometric results (e.g. Sofia et al., 2013; Albani et al., 2004; Oksanen & Sarjakoski, 2005). Second, it does not incorporate the spatial autocorrelation of uncertainty. Errors in spatial data are generally spatially autocorrelated: an error in a benchmark measurement, for example, will affect the elevation values developed from that point, so the uncertainty associated with that error is also spatially dependent. This can create systematic biases in DEMs and poses a problem for non-spatial statistical measures of map accuracy, such as the Root Mean Square Error (RMSE).
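As a hedged illustration of both points, the following Python sketch computes the RMSE of check-point residuals together with a simple empirical semivariogram of those residuals, which can reveal whether the errors are spatially structured rather than random; the check-point coordinates and errors are synthetic placeholders.

```python
import numpy as np
from scipy.spatial.distance import pdist

def rmse(errors):
    """Root Mean Square Error of elevation residuals (DEM minus check points)."""
    return np.sqrt(np.mean(np.asarray(errors) ** 2))

def empirical_semivariogram(xy, errors, lags):
    """Empirical semivariogram of check-point errors: half the mean squared
    difference of error pairs, binned by separation distance."""
    d = pdist(xy)                                             # pairwise distances
    g = 0.5 * pdist(errors[:, None], metric="sqeuclidean")    # half squared error differences
    return np.array([g[(d >= lo) & (d < hi)].mean()
                     for lo, hi in zip(lags[:-1], lags[1:])])

# Hypothetical usage with check points (x, y) and DEM errors in metres
rng = np.random.default_rng(2)
xy = rng.uniform(0, 500, size=(200, 2))
err = 0.05 * rng.standard_normal(200) + 0.0004 * xy[:, 0]    # mild spatial trend
print("RMSE [m]:", rmse(err))
print(empirical_semivariogram(xy, err, lags=np.arange(0, 300, 50)))
```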


Many studies have investigated improved methods to identify systematic errors in DEMs (Sofia et al., 2013; Heritage et al., 2009; Oksanen & Sarjakoski, 2005; Xuejun & Lu, 2008; James et al., 2017b). Semivariograms and fractal dimensions have been shown to analytically confirm the presence and structure of systematic errors in DEMs, and multiple authors have suggested filtering as a means to reduce biases (Milan et al., 2011; Brown & Bara, 1994). Further studies have shown how DEM derivatives can be used to infer DEM errors, by calculating the average value of a land surface parameter from multiple equiprobable realisations of the same DEM (Sofia et al., 2013; Zandbergen, 2011; Hengl, 2006). Fisher and Tate (2006) review the source and nature of errors in DEMs and their derivatives, highlighting methods for the correction of errors and the assessment of fitness for use. Wechsler (2007) brings together a discussion of research in fundamental topical areas related to DEM uncertainty that affect the use of DEMs for hydrologic applications. Januchowski et al. (2010) offer an interesting point of view on the benefits gained from having less error in a model, and on the corresponding cost of reducing model error by choosing one product over another. While not specific to UAV-derived products, these works offer interesting starting points for investigating DEM quality and accuracy in UAV-based studies. Additionally, Goetz et al. (2018) provide an example of error determination for UAV-based SfM DEMs and show how DEM error can be described differently depending on the available validation data. Examples of how to mitigate systematic error in topographic models derived from UAV and ground-based image networks are provided by James and Robson (2014).

3.4.5 Guidance

The real applicability of DEMs derived from UAVs or other sensors should be assessed with respect to the aim of the study. Ideally, it should be evaluated by investigating landforms (how accurate is the shape of the land in the digital landscape?), features (how accurate are ridges and flow lines within the landscape?), surface roughness (how accurately is landscape roughness portrayed in the digital realm?), and consistency (is elevation consistent throughout the landscape?) (Wilson, 2012). These questions should also guide the image acquisition phase.

One approach to investigate the above-mentioned questions is the explore-then-exploit technique (e.g. Roberts et al., 2017), which involves two phases. In the “explore” phase, an initial path is planned with a uniform distribution of views over the area to be reconstructed, similar to an off-the-shelf flight planner. Images are then quickly collected by the UAV and used to generate an initial rough model called a geometric proxy, which can be inspected for roughness, landforms, features and consistency. In the “exploit” phase, a new trajectory is planned according to the geometric proxy, and the collection and reconstruction processes are repeated to generate a high-quality 3D model.

The real applicability of a DEM, guided by the above-mentioned principles, is highly influenced by all processing phases, from data pre-processing to DEM creation.

Regarding the accuracy of landform, feature and roughness representation, the pre-processing phase is the most challenging. To model geomorphological or hydrological processes, it is important to retrieve the actual terrain surface rather than the vegetated surface. In surface models, trees and shrubs are represented as impenetrable obstacles, whereas in reality water and sediment flow around the stems of such vegetation; modelling hydrological behaviour and/or sediment transport with surface models is therefore likely to lead to wrong assumptions. Deriving terrain surfaces from image-based point clouds, especially for areas under dense above-ground coverage (vegetation, buildings), is an area of active research, due to the difficulty of obtaining a suitable number of under-coverage images from multiple perspectives (Sammartano & Spanò, 2016). For UAV surveys of relatively smooth landscapes, good filtering results can be achieved using algorithms that recognise objects according to their variation of height and density above the ground (treated as an almost planar level). For more complex topographies, adaptive algorithms should be preferred, in which the filtering threshold varies with terrain slope (e.g. Pijl et al., 2020; Sammartano & Spanò, 2016). Comparative studies have highlighted how, among others, adaptive TIN algorithms perform better for the filtering of image-based data (Yilmaz & Gungor, 2018; Yilmaz et al., 2018), depending however on the characteristics of the study area.

Figure 3.4-2: Photogrammetric DEMs of a portion of the Tecolote Volcano, Pinacate Volcanic Field, Sonora, Mexico (a) (Scott et al., 2018) created using IDW (b), Natural Neighbor (c), Spline (d), and Kriging (e).


For DEM creation, one should keep in mind that different interpolation methods applied to the same data source may yield different results; it is therefore preferable to evaluate the comparative suitability of these techniques. A challenge in error assessment is that, in practice, it is not always possible to measure the true ground elevation because of time and accessibility constraints. Instead of determining the absolute accuracy of the DEM, it is more common to measure the relative accuracy against sample point measurements known to be of a higher order of accuracy. Bell (2012) offers a useful article summarising and identifying sources of error arising from the interpolation approach; this work enables assessment of the statistical characteristics of errors, their spatial statistical structure and deviations of their distribution as a means to understand spatial structure. As an example, Figure 3.4-2 shows a photogrammetric model of the Tecolote Volcano, Pinacate Volcanic Field, Sonora, Mexico (Scott et al., 2018) created using IDW (b), Natural Neighbor (c), Spline (d), and Kriging (e). Table 3.4-3 reports some statistical measurements.

Table 3.4-3: Statistical measures of a DEM derived using different interpolation techniques. The DEMs are shown in Figure 3.4-2.

From a visual interpretation (Figure 3.4-2), the overall roughness of the models changes depending on the interpolation technique used. The Spline model (d) emphasises rocky outcrops and erosional elements, while the IDW model (b) presents more distributed roughness and striping artifacts. A quick review of the statistical parameters (Table 3.4-3) shows that all interpolated surfaces have negative skewness close to zero, suggesting that the high resolution of the SfM dataset enables accurate interpolation of surfaces. The Natural Neighbor model (c) performs least favourably in terms of skewness, with the distribution being slightly more negatively skewed than for the other techniques. The standard deviation of the Spline is, however, the highest of all interpolation approaches, demonstrating the greatest variation in values around the mean. A greater spread of values around the mean suggests potential sources of error and a reduction in the accuracy of the interpolation approach.
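The statistics behind such a comparison are straightforward to reproduce. The following sketch computes mean, standard deviation and skewness for a set of gridded DEMs; the arrays used here are synthetic stand-ins for DEMs produced by different interpolators.

```python
import numpy as np
from scipy import stats

def dem_summary(dem):
    """Summary statistics of a gridded DEM (NaN cells are ignored)."""
    z = dem[np.isfinite(dem)]
    return {"mean": float(np.mean(z)),
            "std": float(np.std(z)),
            "skewness": float(stats.skew(z))}

# Hypothetical usage: placeholder arrays standing in for interpolated DEMs
rng = np.random.default_rng(3)
dems = {"IDW": rng.normal(250.0, 5.0, size=(100, 100)),
        "Spline": rng.normal(250.0, 6.0, size=(100, 100))}
for name, dem in dems.items():
    print(name, dem_summary(dem))
```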

Given the example shown in Figure 3.4-2, it is essential that surface composition and topographic complexity are considered before selecting an interpolation algorithm. In fluvial geomorphology, for example, Delaunay triangulation or TINs (Brasington et al., 2000) and kriging (Fuller et al., 2003) are often used. Both schemes have been suggested as the best interpolators for landscape surface data (Holmes, 2016), with TINs being computationally efficient, well suited to discontinuous shapes such as ridges and breaks of slope (Wilson & Gallant, 2000a, b), and returning lower elevation errors than other interpolation schemes.

Finally, for real DEM applicability, DEM pre-processing might also be required, strictly depending on the final aim of the study. An important aspect worth mentioning is the use of UAV DEMs for hydrological studies (e.g. Govedarica et al., 2018; Sammartano & Spanò, 2016; Pineux et al., 2017; Leitão et al., 2016). From a hydrologic perspective, the development of UAV techniques should be driven by the necessary inputs to a hydrologic model or by the potential of the imagery to test model predictions. Hydrological analysis requires the ability to simulate flow movements correctly in the digital landscape. For this type of pre-processing (hydrological correction), challenges are introduced with increasing resolution because of artefacts such as systematic DEM errors and small features creating blockages in the landscape or, on the other hand, because of the identification of sinks that could be genuine parts of the investigated landscape (Callaghan & Wickert, 2019). Different pre-processing techniques produce different results (Lidberg et al., 2017). Moreover, hydrological pre-processing alters the landscape to the point that the corrected DEM should not be used for other analyses (e.g. morphological ones). Careful investigation of hydrologic correction can ensure that UAV-based DEMs make their way into products that directly quantify the hydrologic cycle and improve predictive skill at a range of resolutions.

3.4.6 DEM derivatives

A full, objective description of landforms from DEMs is achieved through descriptive measures of the surface form (Evans, 2012), in their purest form as elevation, slope, and aspect, and through increasingly sophisticated measures. This chapter provides a collection of DEM derivatives; the reader should refer to Wilson (2018) and Hutchinson and Gallant (2000a, b) for a complete view of this subject.

Most local topographic variables can be derived from the elevation (z) values within a neighbourhood of each point of the land surface, where z is given by z = f(x, y) and x and y are plan Cartesian coordinates. This implies that caves and empty spaces cannot currently be represented by surface derivatives. The derived landscape parameters are functions of partial derivatives that can be calculated from regular (square-gridded) DEMs by various methods, including finite-difference methods using moving windows (e.g. Wood, 2009) and analytical computations based on DEM interpolation by local splines or on global approximation of a DEM by high-order orthogonal polynomials (Florinsky & Pankratov, 2016).
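As a minimal example of a finite-difference derivative, the following Python sketch computes slope and aspect from a gridded DEM with central differences (np.gradient); the sign convention for aspect assumes rows increasing northwards, and dedicated geomorphometric packages implement the more specific estimators discussed later in this section.

```python
import numpy as np

def slope_aspect(dem, cellsize):
    """First-order derivatives of a gridded DEM by central finite differences.

    Returns slope in degrees and aspect in degrees clockwise from north
    (direction of steepest descent), assuming rows increase northwards."""
    dz_dy, dz_dx = np.gradient(dem, cellsize)       # axis 0 ~ y (north), axis 1 ~ x (east)
    slope = np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))
    aspect = np.degrees(np.arctan2(-dz_dx, -dz_dy)) % 360.0
    return slope, aspect

# Hypothetical usage on a synthetic 1 m DEM (a gently tilted plane)
x, y = np.meshgrid(np.arange(200.0), np.arange(200.0))
dem = 0.05 * x + 0.02 * y
slope, aspect = slope_aspect(dem, cellsize=1.0)
```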

DEM derivatives and topographic parameters can be classified based on their mathematical properties (Florinsky, 2017; Evans & Minár, 2011) and can be grouped into four main classes: (1) local variables; (2) non-local variables; (3) two-field specific variables; and (4) combined variables.

A local morphometric variable is a single-valued bivariate function describing the geometry of the topographic surface in the vicinity of a given point of the surface (Speight, 1974), along directions determined by one of the two pairs of mutually perpendicular normal sections; local variables include first-order (e.g. slope) and second-order (e.g. curvature) derivatives.

A non-local (or regional) morphometric variable is a single-valued bivariate function describing the relative position of a given point on the topographic surface (Speight, 1974). To estimate non-local variables, we generally rely on flow routing (FR) algorithms, which determine the route along which flow is distributed from a given point of the topographic surface to downslope points. FR algorithms can be classified based on their mathematical basis: (1) single-flow-direction algorithms (D8, Figure 3.4-3b), which use one of the eight possible directions separated by 45° to model the flow from a given point (Martz & Garbrecht, 1992); and (2) multiple-flow-direction algorithms (MFD, Figure 3.4-3d), which use flow partitioning (Quinn et al., 1991). Some methods combine D8 and MFD principles, e.g. D-infinity (Figure 3.4-3c) (Tarboton, 1997). Overall, while both approaches perform within an acceptable rate of approximation for convergent hillslopes, in divergent landscapes the D8 method suffers from the discretisation of flow into only one of eight possible directions separated by π/4, which results in a loss of information about the real flow path and leads to biased flow lengths (Figure 3.4-3b). Comparisons among algorithms can be found in Armitage (2019), Orlandini et al. (2011) and Hutchinson et al. (2013).
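A minimal sketch of the D8 principle is given below: for every interior cell, the steepest downslope drop towards one of the eight neighbours is selected. Edge handling, pit filling and flat resolution, which production implementations must address, are deliberately omitted.

```python
import numpy as np

# Offsets (d_row, d_col) and distances of the eight D8 neighbours
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
DISTANCES = np.array([np.hypot(dr, dc) for dr, dc in OFFSETS])

def d8_flow_direction(dem, cellsize=1.0):
    """Single-flow-direction (D8): index 0..7 of the steepest downslope
    neighbour, or -1 for pits and flats. Edges and sinks are not treated."""
    rows, cols = dem.shape
    direction = np.full((rows, cols), -1, dtype=int)
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            drops = [(dem[r, c] - dem[r + dr, c + dc]) / (dist * cellsize)
                     for (dr, dc), dist in zip(OFFSETS, DISTANCES)]
            best = int(np.argmax(drops))
            if drops[best] > 0:
                direction[r, c] = best
    return direction

# Hypothetical usage on a small synthetic, tilted DEM
dem = np.add.outer(np.arange(10.0), np.arange(10.0))
print(d8_flow_direction(dem)[1:-1, 1:-1])
```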

A two-field specific morphometric variable is a single-valued bivariate function describing relations between the topographic surface (located in the gravity field) and other fields, in particular solar irradiation and wind flow. These variables are functions of the first partial derivatives of elevation (as for local variables) and of angles describing the position of the Sun in the sky. An example is topographic openness (Yokoyama et al., 2002) (Figure 3.4-3g, h).

Combined morphometric variables are composed of local and non-local variables. Such attributes consider both the local geometry of the topographic surface and the relative position of a point on the surface, and focus on water flow/soil redistribution or energy/heat regimes (Wilson & Gallant, 2000a, b). Among the combined morphometric variables are the topographic index and the stream power index (SPI, Figure 3.4-3f) (Wilson & Gallant, 2000a, b), among others. Combined variables are derived from DEMs by the sequential application of methods for non-local and local variables, followed by a combination of the results.


Figure 3.4-3: Various topographic parameters: flow accumulation according to the D8 (b), Dinf (c) and MFD (d) methods, total curvature (e), Stream Power Index (SPI) (f), and positive (g) and negative (h) openness, evaluated for a 1 m DEM from LiDAR (Pirotti & Tarolli, 2010). Flow directions, curvature and SPI are computed using ArcGIS 10.6; openness is evaluated using SAGA.

Slope and aspect have been well known in the geosciences for many decades, so there is no need to specify their fields of application. Curvature (Figure 3.4-3e) is systematically used in geomorphic studies to describe, analyse and model landforms and their evolution, to study relationships in the topography-soil-vegetation system, to perform predictive soil and vegetation mapping, to reveal hidden faults as well as to study fold geometry (Drăgut & Dornik, 2013; Tarolli, Sofia et al., 2012), or to recognise thalwegs (negative values, Figure 3.4-3e) and crest lines (positive values, Figure 3.4-3e) (Clubb et al., 2014; Passalacqua et al., 2010). Two-field specific variables are generally the most straightforward approach to visualise landscapes (hillshade or shaded-relief maps) (Chase et al., 2014; Devereux et al., 2008). Flow-routing compound indices, such as the Stream Power Index (SPI, Figure 3.4-3f), are widely used in hydrological and related soil, plant and geomorphic studies, as well as in erosion and soil research (Ferencevic & Ashmore, 2012).


Complexity emerges because geomorphometric analysis is currently implemented in many commercial and open-source software packages, through different methods and algorithms. The results of different workflows often conflict, leading to uncertainties about feature locations, and the bias and limitations of each step are generally transferred and accumulated to the next step. Two aspects are worth mentioning with regard to DEM derivatives and their applicability: redundancy and scale.

Redundancy. Currently, more than 100 land surface parameters exist (Wilson, 2018). Many of these parameters incorporate flow direction, making use of one or more of the many flow direction algorithms proposed during the past decades, and some parameters might not actually be unique (Gessler et al., 2009). Different methods for parameter evaluation imply different characteristics of the resulting map, and the results always depend on the generalisation/resolution and quality of the DEM. Figure 3.4-4 shows slope computed according to Zevenbergen and Thorne (1987) and Evans (1972) using a 1 m LiDAR DEM (Pirotti & Tarolli, 2010); it displays differences between the maps, especially in steep terrain.

Comparative studies in various disciplines, ranging from hydrology (Buchanan et al., 2014; Sørensen et al., 2006) and natural hazards (Barbarella et al., 2017; Favalli & Fornaciai, 2017) to watershed analysis (Liffner et al., 2018) and soil science (Song et al., 2016), show that no single calculation method performs best for all measured variables; instead, the best method is generally variable, site-specific and specific to each field of study. This highlights the importance of clarifying the choice of the considered parameter: procedures and analyses should be described in a sufficiently detailed and transparent way. Without sufficient knowledge of the processes and the software being used, comparative studies can invest greater confidence in the results than may be warranted.

Figure 3.4-4: Example of slope evaluated according to the Evans (1972) (a) and Zevenbergen and Thorne (1987) (b) algorithms, and the relative changes (c). The slope is evaluated using SAGA. Parameters are evaluated from a 1 m DEM from LiDAR (Pirotti & Tarolli, 2010).


Scale. The impact of DEM resolution is a well-known subject in DEM applications. Nonetheless, a further scale issue emerges regarding DEM derivatives. The literature has discussed how the highest resolution does not always provide the optimal information, and how different scales of analysis produce multiple interpretations of a single phenomenon (see Sofia, 2020 for a full review). DEM derivatives are generally computed using a neighbourhood around the data (i.e. a moving window, generally of 3×3 pixels). DEM-derived parameters are much less sensitive to resolution changes than to variations in neighbourhood size (Sofia et al., 2013; Smith et al., 2006).

Figure 3.4-5 exemplifies the effect of choosing a suitable window size to evaluate topographic parameters. Roughness can be associated with the presence of landslides or erosive processes (Booth et al., 2009; Tarolli, 2014). Cavalli et al. (2008) define roughness as the standard deviation of the residual topography (Figure 3.4-5). A rougher topography is identified by high residual topography variability (Tarolli, 2014), and it can be measured over sampling windows of a fixed size that are moved over the DEM (e.g. 3×3, Figure 3.4-5b; 15×15, Figure 3.4-5c; or 33×33, Figure 3.4-5d).
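A hedged sketch of such a roughness index is given below: the residual topography (DEM minus a moving-average smoothed DEM) is computed first, and its standard deviation is then evaluated in moving windows of different sizes. The window sizes and the synthetic DEM are illustrative.

```python
import numpy as np
from scipy import ndimage

def roughness_index(dem, smooth_px=5, window_px=15):
    """Roughness as the local standard deviation of residual topography.

    Residual topography = DEM minus a moving-average smoothed DEM; the
    roughness is its standard deviation within a moving window."""
    smoothed = ndimage.uniform_filter(dem, size=smooth_px)
    residual = dem - smoothed
    mean = ndimage.uniform_filter(residual, size=window_px)
    mean_sq = ndimage.uniform_filter(residual**2, size=window_px)
    return np.sqrt(np.maximum(mean_sq - mean**2, 0.0))

# Hypothetical usage: compare the index for different window sizes
dem = np.random.default_rng(4).normal(0.0, 0.1, size=(300, 300)).cumsum(axis=0)
for w in (3, 15, 33):
    print(w, float(roughness_index(dem, window_px=w).mean()))
```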

Generally, smaller windows are more sensitive to noise and errors (Figure 3.4-5b), while extremely large windows fail to capture the morphology of interest (Figure 3.4-5d). Windows that are two to three times the size of the feature of interest should be preferred (Pirotti & Tarolli, 2010).

A further issue to consider is microtopographic noise. This noise is ubiquitous, especially in high-resolution DEMs from LiDAR or SfM. A branch of the literature has addressed this issue with filtering techniques taken from image analysis, such as diffusive smoothing, optimal Wiener filtering, or nonlinear diffusive (Perona-Malik) filtering (Clubb et al., 2014; Pelletier, 2013; Pelletier & Perron, 2012; Passalacqua et al., 2010).

Figure 3.4-5: Examples of the roughness index of Cavalli et al. (2008) for a landslide area in northern Italy (a), evaluated with moving windows of size 3×3 (b), 15×15 (c) and 33×33 m (d). Roughness is evaluated using ArcGIS 10.6. LiDAR data are provided by Pirotti and Tarolli (2010).


Readers should refer to Sofia (2020); Drăguţ and Dornik (2013); Drăguţ et al. (2011); Drăguţ and Eisank (2011a, b); Sofia et al. (2011); Sofia et al. (2017); Minár and Evans (2015); Evans and Minár (2011); and Minár and Evans (2007) for different views and applications of scale, ranging from scale effects to scale optimisation techniques. As a general rule, elementary landscape forms (segments, units), as the signature of processes, are defined by constant values of fundamental morphometric properties and limited by discontinuities of those properties. This literature suggests that, to identify the underlying process correctly, we must identify the scale that maximises internal homogeneity and external differences.

3.4.8 DEM of difference

A further step in landscape characterisation is offered by change detection techniques (İlsever & Ünsalan, 2012; James et al., 2012), which have gained significant attention due to their capability of providing variations of volumetric and planimetric measures; readers should refer to Qin et al. (2016) for a review. These techniques rely on the availability of multiple topographic datasets covering the same area of interest, real (across time) or simulated, to be used to compare topography. Change detection can either be applied volumetrically, using DEMs (e.g. Bangen et al., 2014; Wheaton et al., 2010; Lane et al., 2003), or in plan, where geomorphological features are delimited from remote sensing imagery or cartography (e.g. Hooke & Yorke, 2010). In this chapter, we focus mostly on volumetric change detection, where two DEMs that share the same geodetic reference are subtracted from one another to reveal morphological changes related to processes (Figure 3.4-6) or to DEM processing (Figure 3.4-7).

Figure 3.4-6: Time-series change detection (e) between 2005 (a, b) and 2008 (c, d) for an anthropogenic landscape in Spain (LiDAR DEMs at 1 m resolution and orthophotos are from Institut Cartografic de Catalunya (ICC), 2005 and 2008). Change detection was performed using the GCD toolbar for ArcGIS 10.6.

Figure 3.4-6 shows an example of a DoD from two LiDAR surveys carried out in 2005 and 2008 for the same area. Summing the total change across the DoD quantifies volumetric changes and highlights patterns related to either deposition (A in Figure 3.4-6e) or erosion (B in Figure 3.4-6e).
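In its simplest form, the computation behind a DoD is a cell-by-cell subtraction. The following sketch, assuming two co-registered DEMs on an identical grid, derives the difference raster and sums positive and negative change into deposition and erosion volumes.

```python
import numpy as np

def dem_of_difference(dem_new, dem_old, cellsize):
    """DoD between two co-registered DEMs on the same grid.

    Returns the difference raster and the deposition / erosion volumes."""
    dod = dem_new - dem_old
    cell_area = cellsize ** 2
    deposition = float(np.nansum(np.where(dod > 0, dod, 0.0)) * cell_area)
    erosion = float(np.nansum(np.where(dod < 0, -dod, 0.0)) * cell_area)
    return dod, deposition, erosion

# Hypothetical usage with two synthetic 1 m DEMs
rng = np.random.default_rng(5)
dem_2005 = rng.normal(100.0, 0.02, size=(100, 100))
dem_2008 = dem_2005.copy()
dem_2008[20:40, 20:40] -= 0.5           # an eroded patch
dod, dep, ero = dem_of_difference(dem_2008, dem_2005, cellsize=1.0)
print(f"deposition {dep:.1f} m³, erosion {ero:.1f} m³")
```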

Applications of DoD in earth-surface processes research often centre on monitoring and detecting change within a system over time. Nonetheless, this technique can also be used to assess the quality of a DEM against a reference dataset (Figure 3.4-7). The advantage of using a DoD to address error is that it allows the patterns and locations of such differences to be identified. Figure 3.4-7 shows a LiDAR DEM at 30 m resolution (c) compared with an SRTM DEM of the same pixel size (30 m) (d). Applying a DoD between the two datasets reveals areas of difference and highlights artifacts in the SRTM DEM: the elevation differences aligned in straight lines in Figure 3.4-7b are due to striping artifacts, a common error in SRTM DEMs (Stevenson et al., 2010).

Figure 3.4-7: Barringer Crater (AZ, USA) (a) and detection of errors (b) between a 30 m LiDAR DEM (c) and the 30 m SRTM DEM (d). The original LiDAR at 0.25 m and the SRTM for the same area are freely available for download at http://opentopo.sdsc.edu/raster?opentopoID=OTSDEM.112011.26912.3 [LiDAR data acquisition and processing completed by the National Center for Airborne Laser Mapping (NCALM – http://www.ncalm.org). NCALM funding provided by NSF's Division of Earth Sciences, Instrumentation and Facilities Program, EAR-1043051; SRTM https://doi.org/10.5069/G9445JDF]. Change detection was performed using the GCD toolbar for ArcGIS 10.6.

Similar DoD studies, comparing DEMs from UAVs with DEMs from a reference survey (e.g. airborne LiDAR), would allow the errors related to the UAV survey to be quantified and its accuracy defined.

3.4.9 Quantifying spatially variable uncertainty

Differencing sequential sets of DEMs can be used to detect and quantify geomorphic change, to understand processes, or to infer the quality of a DEM. Nonetheless, valuable information concerning landscape change may be lost in areas where the mean error is higher than the change being measured. This is of crucial significance in small-scale erosion studies (e.g. Kaiser et al., 2018), where changes are often very subtle in nature and their magnitude is similar to that of the uncertainties.

It is, therefore, essential to understand the distinction between estimating the errors of individual DEMs (chapter 2.2) and propagating those errors and choosing a technique by which to threshold the DoD to separate noise from signal. Researchers should also account for the fact that biases are likely to be spatially variable; the signal-to-noise ratio is therefore likely to vary across an area of interest, to different degrees depending on landscape complexity and the survey system.

At all scales and for all applications, users must understand: (i) the technology and its limitations at the time of data collection; (ii) how post-processing steps (point cloud classification and generation of the gridded product) for each individual dataset might affect the results; and (iii) the georeferencing information for the original data, as systematic errors can be introduced at any one of these steps.

To produce a more realistic spatial representation of morphological changes, the literature suggests taking into consideration the spatial patterns of DEM errors (Javernick et al., 2014; Lane et al., 2003; Milan et al., 2007). This can be accomplished, for example, through stochastic realisations of the same DEMs (Hawker et al., 2018), or by comparison with a reference survey (e.g. Figure 3.4-7). With these comparisons, it is possible to derive a spatially distributed estimation of errors, to be further considered in the DoD. Wheaton et al. (2010) also present a technique for estimating the magnitude of DEM uncertainty in a spatially variable manner through the use of fuzzy set theory.

An assessment of the error in the DoD-derived quantity can be made formally, assuming that both inputs can be treated as independent (Brasington et al., 2000). Independently of how the DEM errors are estimated (e.g. spatially uniform, fuzzy inference systems, user-specified spatially variable), there are various possible ways of propagating these errors into the DoD.

In the simplest approach, accuracy measures are applied as a ‘minimum level of detection’ (LoD) to account for the propagated error in the considered dataset before performing the geomorphic change detection. The main limitation of this approach is that the statistic is generally averaged across the whole surface.

As an alternative, a probabilistic approach has been suggested for determining the uncertainty in the magnitude of change for each data point in a DEM of difference (Lane et al., 2003; Brasington et al., 2003). These authors show how probabilistic thresholding can be carried out with a user-defined confidence interval; an error-reduced DoD can then be obtained by discarding all changes with probability values below the chosen threshold. As a more advanced approach, Wheaton et al. (2010) suggest using Bayesian statistics updated with additional information (e.g. spatial coherence filters) to define the threshold for the DoD. These more advanced approaches yield less conservative volumetric estimates than a spatially uniform LoD and provide more plausible and physically meaningful results (Prosdocimi et al., 2016).
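A minimal sketch of error propagation and thresholding, in the spirit of the approaches cited above, is given below: the per-survey errors (uniform or spatially variable) are combined into a propagated DoD error, a minimum level of detection is derived for a chosen confidence level, and non-significant cells are masked. The normal-distribution critical value and the parameter names are simplifying assumptions.

```python
import numpy as np
from scipy import stats

def threshold_dod(dod, sigma_new, sigma_old, confidence=0.95):
    """Threshold a DoD by the propagated elevation error of the two surveys.

    sigma_new / sigma_old may be scalars (uniform error) or rasters
    (spatially variable error). Cells whose change is not significant at the
    chosen confidence level are masked out."""
    sigma_dod = np.sqrt(np.square(sigma_new) + np.square(sigma_old))
    t_crit = stats.norm.ppf(0.5 + confidence / 2.0)   # two-sided critical value
    lod = t_crit * sigma_dod                          # minimum level of detection
    significant = np.abs(dod) >= lod
    return np.where(significant, dod, np.nan), lod

# Hypothetical usage: uniform 0.05 m errors for both surveys at 95 % confidence
rng = np.random.default_rng(6)
dod = rng.normal(0.0, 0.05, size=(100, 100))
dod[10:30, 10:30] += 0.4
dod_thr, lod = threshold_dod(dod, sigma_new=0.05, sigma_old=0.05)
```

Replacing the scalar errors with rasters (e.g. from a fuzzy inference system) yields the spatially variable thresholding discussed below.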

Poor quantification of uncertainty can erroneously over- or underestimate real change. Uniform thresholds, in particular, overestimate change in areas where change would not be expected, such as stable hillslopes, and underestimate it in areas where it is expected. More appropriate results are obtained when using a spatially variable DEM error model that combines the influence of various error sources (such as slope, point density, and vegetation) in a fuzzy inference system (Prosdocimi et al., 2017; Vericat et al., 2015; Wheaton et al., 2010).

3.4.10 Final remarks

DEMs and DoDs are critical for geoscientific studies focusing on the description and classification of landforms, on the dynamic processes characterising their evolution and existence, and on their relationship to and association with other forms and processes. Attention to three critical points will yield substantial benefits in the use of DEMs for landscape analysis.

First, it is important to improve our knowledge of the presence and propagation of errors, both in current remote sensing data sources and in new ones as they emerge. DEM quality and resolution must be consistent with the scale of the application and of the processes that are modelled, the size of the land surface features to be resolved, and the study objectives. DEM errors should always be accounted for, including information about their correlation and magnitude, and approaches should be developed that account for variable sources of error, especially when dealing with DoDs.

The second critical element concerns scale effects. The rapid advent and adoption of high-resolution remote sensing digital elevation data sources mean that there is an urgent need to improve our understanding of how these fine-scale data influence the computed land surface parameters. Higher resolution does not always imply a better representation of surfaces; it also comes with higher levels of noise being captured. Moreover, many vital parameters can be understood at lower resolutions, given careful consideration of how the analyses are performed.

The availability of medium-resolution global models also calls for a more comprehensive study of scale. We must also account for the scale at which we infer topography, in terms of the window of analysis, and we must put careful consideration into how terrain analysis moves across scales.

The third point is critical for those interested in calculating one or more of the described land surface parameters as part of a digital terrain modelling workflow and using the results as inputs to environmental applications. At each step (i.e. pre-processing, filtering, DEM interpolation, evaluation of derivatives), researchers must choose wisely among the various options available for each task, paying particular attention to the research goal, the advantages and disadvantages of different data sources and digital terrain modelling techniques, the characteristics of their study area, how errors might have been introduced and propagated in their workflows, and the significance of these errors for the results that are produced. Critical thinking must be applied to the interpretation of the results. The quality of the data and of our assumptions about the process under investigation form the basis for the environmental application. A small blunder can set off a chain of errors that can go undetected for long periods; and when it does get noticed, it can take quite some time to recognise the source of the issue, and even longer to correct it.

Notwithstanding the complexity of terrain analysis and DEM applications, digital terrain analysis offers interpretative, analytical investigations of past and current patterns of processes, and can help improve, and possibly predict, future earth surface processes. This unique set of tools and techniques provides field evidence of changes to the landscape in response to various drivers of natural or anthropogenic origin. Thus, it enhances our understanding of changes in geomorphic systems and of vulnerabilities of landscapes and society at a variety of scales, from micro, to local, to global.


References for further reading


3.5 Deformation measurements based on point clouds

Daniel Wujanz

3.5.1 Error budget of deformation measurement based on UAVs
3.5.2 Aliasing in 3D-data acquisition
3.5.3 Strategies for deformation measurement based on point clouds
3.5.3.1 Model-free deformation measurement
3.5.3.2 Segmentation-based deformation measurement
3.5.4 Summary and open issues

Geomorphometric computations rely on physical measurements of the Earth's surface that ultimately form the basis for understanding geomorphological processes. Although the roots of this approach can be traced back to the early 1800s (Pike, 2002), it took more than a century until the geodetic strategies for data acquisition and processing emerged that are still being used today. Presumably the first realisation of geomorphometric measurements was carried out in the field of engineering surveying (Ganz, 1914); the task is also referred to as change detection, deformation measurement, deformation monitoring or deformation analysis in the field of geodesy. In the following, the term deformation measurement is used, since the primary aim of this process is to quantify and finally visualise deformations. The analysis itself is an interpretational task which is finally carried out by an expert in the respective field, while the computation of deformations requires geodetic knowledge.

Deformation measurement is conducted by surveying an area of interest at different points in time, referred to as epochs. Geometric changes are then identified based on the captured data. This requires a stable reference frame, which is determined by immovable control points, a methodology referred to as congruency modelling (Heunecke & Welsch, 2000). To achieve this prerequisite, both stable areas and those potentially subject to deformations need to be identified within the area under investigation. Once stable points (e.g. in the case of tacheometry) or areas (when using point clouds; Wujanz et al., 2018) have been detected, they can be used to transform a given epoch into a reference epoch for deformation measurement. As a final step, differences between the point clouds are computed that eventually reveal deformations. The outcome is visualised by colour-coding the points of one dataset according to their magnitude of deformation, which can finally be used to draw conclusions about the geomorphometric behaviour of the area of interest.

The process chain of deformation measurement typically involves the following steps, regardless of which strategy for data acquisition was chosen:

• Planning of a survey

• Acquisition of an epoch

• (Geo-) Referencing the data, see chapter 2.1 for details.

• Quantification of deformations

Sound summaries of deformation monitoring based on point clouds can be found in, e.g., Jaboyedoff et al. (2012), Lindenbergh and Pietrzyk (2015) or Wujanz (2016).

3.5.1 Error budget of deformation measurement based on UAVs

Perfection is unfortunately just a theoretical concept and therefore beyond reach in practice, regardless of whether we look at the accuracy of sensors, algorithms or computations performed by a computer. Hence, it is vital to consider all potentially relevant error sources that interfere with the desired outcome; these are gathered in a so-called error budget (Soudarissanane, 2016). The more realistically the occurring errors can be estimated, the more reliably deformations can be distinguished from the random or systematic errors that are provoked along the path of data processing. An error budget for an unmanned aerial vehicle that is applied for deformation measurement may contain, among others, the following components:

• Accuracy of the applied sensors for 3D-data acquisition (see e.g. chapter 2.3), positioning, and orientation (see chapter 2.1)

• Bore-sight calibration of all sensors (Jutzi et al., 2014)

• Registration/referencing of point clouds

• Sampling process/aliasing, see chapter 3.5.2

• Quantification of deformations, see chapter 3.5.3

Particularly critical aspects will be discussed in greater detail in the following.


3.5.2 Aliasing in 3D-data acquisition

The point sampling of all 3D-data acquisition techniques varies in dependence on the survey configuration, for instance flight altitude and relative orientation to the object of interest, as well as on the selected sensor settings, such as the chosen resolution or frequency of data acquisition. Every 3D documentation of an object can thus be interpreted as a coherent geometric representation; a direct comparison with other descriptions of the same object, however, usually leads to pseudo-deformations. This effect is illustrated in Figure 3.5-1. The left part of the figure shows three different geometric descriptions of an identical object. For the sake of clarity, a single profile of each point cloud is shown on the right. If we now triangulate the point clouds and look sideways at the resulting profiles, apparent differences emerge, especially in unsteady and poorly resolved areas. 3D-data acquisition techniques can therefore be interpreted as polymorphic measuring methods.

Figure 3.5-1: Genesis of pseudo-deformation as a consequence of aliasing (Wujanz, 2018; Copyright VDE Verlag; used with permission – all rights reserved).

3.5.3 Strategies for deformation measurement based on point clouds

Point clouds that are enriched with temporal information can be interpreted and hence processed in several ways. Potential processing strategies can be based on:


• model-free assumptions, which is subject of chapter 3.5.3.1

• computed segments, as discussed in chapter 3.5.3.2, or

• an assumed model.

A peculiarity of the last-mentioned strategy is that changes of an object over time are not computed based on the original point cloud or subsets of it. Instead, the captured points are approximated with respect to an assumed model which characterises an object's shape. Deformations are given if differences among estimated parameters between epochs are statistically significant. Even though this strategy could be useful for geomorphometry, for instance to characterise changes in steady landscapes such as dunes or snow cover, it will not be discussed in greater detail here. This can be explained by its early scientific stage, where current research primarily focuses on monitoring man-made structures (e.g. Holst et al., 2019). The following two sections discuss the remaining strategies in detail.

3.5.3.1 Model-free deformation measurement

If no information is available regarding the assumed geometric shape of a point cloud or parts of it, the approach is referred to as model-free deformation measurement. Since this applies in most cases, model-free deformation measurement can be considered the standard case. In principle, several procedures of model-free deformation measurement can be differentiated, mostly with regard to the chosen strategy for forming correspondences. The crux of this task arises from the already mentioned quasi-laminar characteristics of point clouds (Wujanz, 2016, p. 1), which means that aliasing is inevitable and that no repeatedly observable points can be recorded at different points in time. Therefore, the determined deformation vectors are of a purely interpretive nature and do not necessarily correspond to the physical direction of action of an occurred deformation, since they cannot be assigned to a semantic object or to specific points on the object. The following section describes the two most common methods to establish point correspondences, which can be found in several implementations. Subsequently, three widely distributed algorithms for deformation measurement, which are all publicly available in CloudCompare (Girardeau-Montaut, 2011), will be discussed.

The correspondence problem

The key prerequisite for referencing and computing deformations based on point clouds is the formation of point correspondences. The simplest solution to achieve this is referred to as point-to-point correspondences, in which point pairs are established based on the minimal distance between two point clouds. Examples are Besl and McKay (1992) in the context of registration of point clouds and Girardeau-Montaut et al. (2005) for deformation measurement in laser scans. Figure 3.5-2 illustrates this concept, where points of a reference epoch are highlighted in green. In the left part of the figure, the red sphere indicates data from a subsequent epoch that were captured by the depicted scanner. The closest point between the reference epoch and the succeeding epoch is interpreted as the corresponding point, which is highlighted in yellow in the centre of the figure. The vector between the corresponding points can be interpreted as the geometrical difference, or deformation, that may have occurred, and is shown as a yellow line in the right part of the figure.

The main problem of this approach is that the computed geometrical difference based on point-to-point correspondences depends directly on local differences in the point sampling of the point clouds and thus on the present resolution. Another disadvantage is that no distinction can be made regarding the sign of the deformation. In practice, this means that geometrical differences between epochs can be computed, yet it is not clear whether, for example, a material gain or loss has occurred.

Figure 3.5-2: The concept of point-to-point correspondences (based on Wujanz, 2018).
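A minimal sketch of point-to-point correspondences is given below: a k-d tree built on the reference epoch returns, for every point of a subsequent epoch, the distance to its nearest neighbour, which is taken as the unsigned geometric difference. The synthetic epochs are placeholders and, as noted above, the sign of the change cannot be recovered this way.

```python
import numpy as np
from scipy.spatial import cKDTree

def cloud_to_cloud_distance(reference_xyz, epoch_xyz):
    """Point-to-point correspondences: nearest neighbour in the reference
    epoch for every point of a subsequent epoch. The returned distances are
    unsigned, so gain and loss of material cannot be distinguished."""
    tree = cKDTree(reference_xyz)
    distances, indices = tree.query(epoch_xyz, k=1)
    return distances, indices

# Hypothetical usage with two synthetic epochs (N x 3 coordinate arrays)
rng = np.random.default_rng(7)
reference = rng.uniform(0, 50, size=(20000, 3))
epoch2 = reference + rng.normal(0, 0.02, size=reference.shape)   # small noise only
dist, _ = cloud_to_cloud_distance(reference, epoch2)
print("median point-to-point distance [m]:", float(np.median(dist)))
```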

To compensate for the inevitable effect of aliasing in capturing point clouds, an alternative approach can be used to approximately solve the problem. For this, point-to-triangle correspondences are established, as suggested by Chen and Medioni (1992) for the registration of point clouds and by Cignoni et al. (1998) for determining deformations between epochs. The general concept is depicted in Figure 3.5-3. While the starting points are geometrically identical to the ones shown in Figure 3.5-2, the reference epoch is triangulated. Correspondences are established when a point from a subsequent epoch can be projected onto a triangle of the reference epoch. The vector between the base point and the point from the successive epoch, which is highlighted by a yellow line in the centre part of the figure, can be interpreted as a deformation. Consequently, its orientation determines the direction of action which, as in the case of point-to-point correspondences, does not necessarily coincide with the physical one that occurred in the object space.

While this strategy at least compensates for differences in local sampling, it also creates new problems, as shown on the right of the figure. For the present example, another point was added, which consequently leads to an additional triangle. Since the point from the successive epoch can be projected onto both triangles, an ambiguity results, as its correspondence cannot be clarified with certainty. This problem typically appears in unsteady landforms, e.g. block glaciers. In theory, the use of the normal direction allows the sign of a deformation to be differentiated, i.e. whether it is a positive or negative change. If a point is located on the side to which the normal vector is pointing, this point receives a positive sign and could therefore be interpreted as a gain of matter or a motion towards the sensor. Since information concerning the survey configuration is typically lost in the process of registration of all point clouds and their unification into a common data set, an arbitrary orientation of the surface normals usually occurs. The essential information as to whether an area is subject to growth or loss can therefore only be determined by common sense or expert knowledge.

A routine commonly used in practice is to reduce the point density of the reference point cloud to save time during its triangulation. This course of action should be avoided at all costs, since it most probably 'artificially' creates aliasing effects and consequently pseudo-deformations.

Figure 3.5-3: The concept of point-to-triangle correspondences (based on Wujanz, 2018; Copyright VDE Verlag; used with permission – all rights reserved).

Computation of deformations

An ever-present problem in engineering is the question of whether a given signal, in our case a geometric difference between point clouds, is notably larger than the given noise. In the field of geodesy, the resulting noise level of an entire process chain is typically computed by means of error propagation (Schaer et al., 2007; Mezian et al., 2016; Ghilani, 2017) based on the aforementioned error budget. If the ratio between signal and noise is statistically significant (Teunissen et al., 2020), it can be assumed that deformation has occurred between the epochs. In the following, we look at how common algorithms for deformation measurement based on point clouds determine whether a geometric difference between point clouds is considered deformation or noise. The final step of deformation measurement based on point clouds is the visualisation of the results, which forms the basis for their interpretation. For this, points or triangles of one epoch are colour-coded according to the magnitude of the associated deformation.

The two most widely deployed approaches for deformation measurement based on point clouds in science and industry are the Metro algorithm, as suggested by Cignoni et al. (1998), and that of Girardeau-Montaut et al. (2005). Neither approach considers an error budget at all, which means that deformations are distinguished from noise by setting fixed boundaries. Consequently, the colour scaling of the final results can be freely adjusted and, as a result, arbitrary results can be generated; a sound statement as to whether a statistically significant deformation is present or not cannot be made. A detailed and recommendable study of algorithms for model-free deformation measurement based on point clouds was carried out by Holst et al. (2017).

To overcome the aforementioned drawback and to verify the statistical significance of deformations, Lague et al. (2013) determine stochastic measures for local point adjacencies in their Multiscale Model to Model Cloud Comparison (M3C2) algorithm. For this purpose, a cylinder diameter must be defined, which determines the circumference within which points are considered for the calculation of the stochastic measures. The calculated numbers are finally assigned to so-called core points; the distance between core points from two epochs represents the geometric difference between the point clouds, and a statistical test assesses whether this difference is a significant deformation or not. Although this algorithm can be regarded as the most mature one among the publicly available solutions, it still deploys a rather simple error budget. Frankly speaking, the consideration of an applied sensor's uncertainty for 3D data acquisition is a very challenging and thus ongoing research question, especially in the field of laser scanning (Wujanz et al., 2017; Heinz et al., 2018). Hence, the given algorithm could be extended as soon as the corresponding research has reached a mature state. An error component that easily exceeds the uncertainty of 3D data acquisition is the influence of referencing/registration. This effect is, however, only considered rudimentarily in the M3C2, as a global error component, while it is well known that this error is non-isotropic, which means that its impact is not equally distributed within referenced point clouds – just as in any surveying technique (Ghilani, 2017).
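For illustration only, the following sketch implements a greatly simplified, vertical-only analogue of such a core-point comparison: within a search radius around each core point, the mean elevations of both epochs are differenced and a combined standard error serves as a rough significance measure. The real M3C2 operates along locally estimated surface normals with cylindrical neighbourhoods; all parameter names here are illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

def vertical_m3c2_like(core_xy, epoch1_xyz, epoch2_xyz, radius=1.0):
    """Greatly simplified, vertical-only analogue of an M3C2-style comparison.

    For each core point, the mean elevations of both epochs within `radius`
    are differenced; the combined standard error serves as a rough local
    significance measure. The original M3C2 (Lague et al., 2013) instead uses
    cylinders oriented along locally estimated surface normals."""
    tree1 = cKDTree(epoch1_xyz[:, :2])
    tree2 = cKDTree(epoch2_xyz[:, :2])
    change = np.full(len(core_xy), np.nan)
    std_err = np.full(len(core_xy), np.nan)
    for i, pt in enumerate(core_xy):
        idx1 = tree1.query_ball_point(pt, radius)
        idx2 = tree2.query_ball_point(pt, radius)
        if len(idx1) < 4 or len(idx2) < 4:
            continue                                   # not enough neighbours
        z1, z2 = epoch1_xyz[idx1, 2], epoch2_xyz[idx2, 2]
        change[i] = z2.mean() - z1.mean()
        std_err[i] = np.sqrt(z1.var() / len(z1) + z2.var() / len(z2))
    significant = np.abs(change) > 1.96 * std_err      # ~95 % confidence
    return change, std_err, significant

# Hypothetical usage with synthetic epochs and thinned core points
rng = np.random.default_rng(8)
epoch1 = np.column_stack([rng.uniform(0, 20, (5000, 2)), rng.normal(0, 0.02, 5000)])
epoch2 = epoch1 + np.array([0.0, 0.0, 0.10])           # uniform 10 cm uplift
core = epoch1[::50, :2]
dz, se, sig = vertical_m3c2_like(core, epoch1, epoch2, radius=1.0)
```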

In the following, a practical example is processed by two of the solutions discussed above. The data, which were kindly provided by the first author of Al-Rawabdeh et al. (2017), consist of two epochs featuring a landslide in Canada and were captured by a UAV-mounted camera. The most active area can be seen in the upper half on the left of Figure 3.5-4. First, the two referenced point clouds were processed using a point-to-point-based strategy. The upper boundary was set to 3 m, which restricts the largest distance between two points from different point clouds and consequently also limits the largest magnitude of a deformation. The other end of the spectrum was set to 5 cm, which means that all geometric differences smaller than this value are considered geometrically stable. Since no information about the local orientation of individual points is given, it is not possible to distinguish between, e.g., subsidence and heave. The centre part of the figure shows all points that are considered to be subject to deformation, with the colour bar given in metres. It is obvious that increasing the lower threshold would reject many of the shown points. From an interpretational point of view, a change of this global threshold would lead to the conclusion that this landslide and its surroundings are less active than the shown result suggests. The right part of Figure 3.5-4 shows the outcome generated by the M3C2 using the suggested parameters. Note that only core points are shown, which is why the point density on the right is lower than in the centre. Since every core point also receives a computed face normal, it is possible to distinguish differences in the direction of a deformation; thus, the colour bar ranges from +3 m to −3 m. Comparing the figures in the centre and on the right does not reveal obvious differences in the upper range of deformations. However, the lower end of the spectrum differs notably and thus has a great impact when, e.g., areal changes are reported. This inconspicuous circumstance is particularly sensitive when numbers are reported for areas of public interest, such as the retreat of a glacier as a consequence of changing climatic conditions, or the featured landslide, which occurred in close proximity to a residential area. We should be aware that numbers are powerful information that can, as history has shown numerous times, be misused to discredit scientific evidence or measurements that are not in one's favour. Thus, it should be in our own interest to produce sound and reproducible numbers, regardless of which sensors and algorithms were used.

Figure 3.5-4: RGB-image of the reference point cloud (left), deformation maps based on point-to-point correspondences (centre) and the M3C2-algorithm (right) (Wujanz, 2018; Copyright VDE Verlag; Used with permission – all rights reserved).


The outcome of model-free deformation measurement approaches allows geometric conclusions to be drawn. However, in many fields of application, it is desirable to analyse deformation patterns or changes of an object at the level of single objects or distinguishable areas with different properties (Anders et al., 2021). This step requires a semantic layer and can be achieved by segmenting or classifying the input point clouds (Brodu & Lague, 2012; Poux, 2019). Up to now, there is no established term for these rather novel approaches that operate at segment level. Therefore, the term segmentation-based deformation measurement will be used throughout this section. Note that this strategy is ambiguous and can therefore be interpreted in several ways, depending on how subsets of the original data are created and how changes are determined.

A prerequisite for this deformation model is a preceding segmentation or classification of the input point clouds based on geometric and/or radiometric information, for instance in the form of intensity values captured by a laser scanner or RGB values from imagery. The result of this elementary pre-processing step corresponds to an object generation, where each point of the input data is assigned to exactly one segment. Consequently, the original point clouds are divided into subsets of points with equal characteristics. Typical classes in geomorphology could be, for instance, bedrock, vegetation, boulders, or deposits.

Geomorphometric measures, such as velocity and magnitude, are of great interest to geoscientists and can be derived by the algorithms described in the previous section. Yet, they are not capable of deriving another vital measure, namely the direction of action. Gojcic et al. (2018) address this drawback and thus present one possible interpretation of segment-based deformation measurement. The starting point of this algorithm is the segmentation of single boulders from the original point cloud. Subsequently, point-to-point correspondences between individual points on an object captured at different times are formed with the help of a local feature descriptor. Finally, the direction of action as well as the magnitude of the deformation can be determined on the basis of the established correspondences.

Mayr et al. (2019) classify point clouds of a recorded landslide, as depicted on the left of Figure 3.5-5, into seven geomorphological classes (right part of the same figure), which provides the basis for a second interpretation of this strategy. For this purpose, a combination of a supervised classification and a rule-based re-classification using object knowledge is used (Mayr et al., 2017). Deformations between different epochs are determined by an approach comparable to Lague et al. (2013). The generated semantic information allows a) to assign deformations to one of the seven geomorphological classes and b) to detect local changes regarding the classification outcome.
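Once such a classification exists, assigning detected changes to classes is a simple grouping operation. The following sketch (illustrative names, not the workflow of Mayr et al., 2019) aggregates per-point change values by semantic class:

```python
import numpy as np

def summarize_change_per_class(class_labels, distances, class_names):
    """Aggregate per-point change values by semantic class, e.g. to report how
    much of the detected deformation falls into each geomorphological unit."""
    summary = {}
    for cls, name in enumerate(class_names):
        d = distances[class_labels == cls]
        if d.size:
            summary[name] = {"n_points": int(d.size),
                             "mean_change_m": float(d.mean()),
                             "max_abs_change_m": float(np.abs(d).max())}
    return summary
```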


Figure 3.5-5: Documented landslide (left) and segmented point cloud (right) (Mayr et al., 2019). Reproduced with permission from the American Society for Photogrammetry and Remote Sensing, Bethesda, Maryland, www.asprs.org.

3.5.4 Summary and open issues

Classical deformation measurement based on established surveying techniques, such as levelling or tacheometry, has always been a domain of a few highly specialised experts in the field of engineering geodesy. The emergence of affordable and comparably precise sensors for contactless 3D-mapping and of available algorithms for post-processing, however, initiated an inverse trend – deformation measurement can now be considered a mainstream tool that is carried out across many fields of specialisation. A downside of this development is that numerous studies appear to be carried out in the haze of superficial knowledge. Many users simply do not know how the applied algorithms work or how tuneable parameters influence the outcome, and are not aware of error sources and how to consider them. Another problem is that the vast majority of scientific output about deformation measurement based on point clouds simply deploys existing software, while only a tiny fraction of publications suggests new implementations that account for known and critical shortcomings of existing ones. Even though 3D-mapping and processing has turned into a profitable market and is vividly used across many scientific disciplines, very few universities have added this subject to their curriculum, which could help to address these issues. Despite all existing problems, deformation measurement based on contemporary 3D-mapping data is a giant leap forward in terms of spatial information density as well as its quality and thus provides a very powerful tool to geoscientists in their quest of revealing a deeper understanding of geomorphological processes (Zahs et al., 2020). Apart from very promising results generated by segmentation-based approaches, recent research efforts have also started to address the vital issue of error estimation, see for instance Fey et al. (2017), James et al. (2017) or Mayr et al. (2020).

References


4. Applications


4.1 UAVs in geology

Moritz Kirsch, Sandra Lorenz, Robert Zimmermann, Yuleika Madriz, Robert Jackisch, Samuel T. Thiele and Richard Gloaguen

4.1.1 Geological analysis using UAV-based photogrammetry
4.1.2 Geological analysis using UAV-based hyperspectral imaging
4.1.3 Geological analysis using UAV magnetics
4.1.4 Outlook and conclusions

Geological analysis is crucial for successful resource exploration, natural hazard assessment, infrastructure development and the scientific investigation of the Earth's history. However, the structural and compositional complexity of many geological terranes, combined with often limited exposure, makes such analyses a challenging task. To overcome these difficulties, mapping efforts typically synthesise diverse datasets gained directly from geological outcrops and indirectly via geophysical methods that can measure physical properties of the subsurface.

UAVs provide an accessible, cost- and time-efficient tool for acquiring high-resolution, multi-sensor, multi-temporal and multi-perspective data, and so are increasingly used for geological purposes. A single UAV survey can cover areas of up to tens of square kilometres at cm–dm spatial resolution, bridging the scale gap between ground-based geological field work and airborne mapping campaigns. UAVs can also carry geophysical sensors that provide surface and subsurface lithological and structural information in areas that may be cumbersome, difficult or dangerous to access by traditional means.

Table 4.1-1 gathers a non-exhaustive list of UAV sensors currently or potentially used for geological applications. In both geological research and the mining industry, the most widespread application of UAVs is the generation of digital outcrop models by Structure-from-Motion Multi-View Stereo (SfM-MVS) photogrammetry. These models can then be used to map structures or lithologies, capture fracture data for geotechnical analysis or fluid flow modelling, and monitor slope stability or raw material production. Magnetic and hyperspectral sensors are also being deployed on UAVs as emerging tools for exploration targeting, while thermal cameras and gas sensors are becoming prominent in the volcanology community (covered in the chapter on the application of UAVs in volcanology) and for hydrogeological applications.

Apart from these more common uses, there are only a few reports on the use of UAVs for surveys with miniaturized versions of geophysical sensors usually mounted on helicopters or airplanes, such as light detection and ranging (LiDAR), very-low-frequency electromagnetics (VLF-EM), full tensor magnetic gradiometry (FTMG), and ground-penetrating radar (GPR). LiDAR sensors mounted on UAVs provide an alternative way of generating topographic data with the added benefit of being able to penetrate vegetation. UAV-borne VLF-EM sensors are a recent, more flexible adaptation of their airborne counterparts that have been used for environmental and exploration purposes since the 1960s. VLF-EM utilizes distant transmitters broadcasting at frequencies in the range of 15–30 kHz to map resistivity contrasts to depths of ca. 100 m below the surface. FTMG systems provide measurements of the full magnetic gradient tensor of the Earth's magnetic field, allowing the resolution of deep, small or weakly magnetic targets and a calculation of the remanent magnetisation vector. GPR-drone integrated systems are flown at low elevation and low speed to produce high-resolution, three-dimensional imaging data of the near surface (e.g. for soil-layer profiling) based on reflections of high-frequency radio waves induced into the ground. Additionally, UAV-mountable gravimeters have recently been developed, which should enable a more efficient solution for mapping density contrasts of rocks underground. However, for all the sensors mentioned in this paragraph, no geologic case studies have been published to date. This chapter will therefore focus on photogrammetry-derived digital outcrop models, hyperspectral imaging and magnetic surveys, describing challenges and best practices in terms of acquisition, processing and interpretation of these data for geological purposes.

Table 4.1-1: Non-exhaustive compilation of UAV-based sensors and their applications in geological research.

4.1.1 Geological analysis using UAV-based photogrammetry

Among the most common UAV-based datasets utilized for geological purposes are high-resolution photographs that can be transformed into digital outcrop models by means of SfM-MVS photogrammetry (e.g., Ullman, 1979; Seitz et al., 2006; Westoby et al., 2012; James & Robson, 2012). Digital outcrop models are virtual representations of geologic outcrops that consist of either large point clouds or photo-textured meshes. These models provide a rapid and objective way of capturing outcrop information at sub-cm resolution over wide areas. Hence, they are an ideal dataset for lithological, stratigraphic and structural mapping (Bemis et al., 2014; Nesbit et al., 2018; Dering et al., 2019), erosion and rockfall monitoring (Vanneschi et al., 2019; Menegoni et al., 2019), reservoir characterization (Priddy et al., 2019) and open pit mine surveying (Chen et al., 2015; Ren et al., 2019).

Numerous software packages and algorithms have been developed for the purpose of visualizing and manipulating digital outcrop models (OpenPlot—Tavani et al., 2011; VRGS—Hodgetts et al., 2015; LIME—Buckley et al., 2019; CloudCompare—Girardeau-Montaut, 2011). These software packages can also be used to interpret datasets and extract structural measurements, though this can be a very time-consuming process for large, high-resolution models. A variety of automatic and semi-automatic methods are beginning to emerge to help optimise the interpretation process and improve objectivity and reproducibility (e.g. Vasuki et al., 2014; Dewez et al., 2016; Thiele et al., 2017, 2019; Guo et al., 2018). The natural variability and multi-scale nature of geological structures makes this a challenging task, so there is significant scope for new developments. Using these (semi-)automatic methods, unprecedentedly detailed datasets can be extracted (Figure 4.1-1), such as the planar orientations of lithologic contacts, faults, veins, and joints (Figure 4.1-1B), and derived measurements such as dyke or layer thickness (Figure 4.1-1B), fracture spacing, density, and persistence. These data can then feed into 3D models (Figure 4.1-1C; e.g., Bistacchi et al., 2015; Hansman & Ring, 2019), e.g., for visualization, volume calculations, kinematic restoration, and reservoir and geomechanical modelling.
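As a simple illustration of how orientation measurements can be derived from a digital outcrop model, the following Python sketch fits a plane to a digitised patch of points and converts its normal into dip and dip direction. It is a generic least-squares plane fit, not the algorithm of the Compass plugin or of Thiele et al. (2017, 2019).

```python
import numpy as np

def dip_and_dip_direction(points):
    """Fit a plane to a patch of 3D points (east, north, up) and return
    dip (deg) and dip direction (deg, clockwise from north)."""
    centred = points - points.mean(axis=0)
    # plane normal = right singular vector with the smallest singular value
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    n = vt[-1]
    if n[2] < 0:                        # make the normal point upwards
        n = -n
    dip = np.degrees(np.arccos(n[2]))   # angle between normal and vertical
    dip_dir = np.degrees(np.arctan2(n[0], n[1])) % 360.0  # azimuth of steepest descent
    return dip, dip_dir
```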

Figure 4.1-1: Example digital outcrop model and interpreted stratigraphy (a) acquired by conducting a UAV survey of a cliff face in Caldera Taburiente, La Palma, Spain. Dykes and sills were mapped using the Compass plugin in CloudCompare (Thiele et al., 2017) and a large number of orientation and thickness measurements (b) extracted using the method described by Thiele et al. (2019). These data were then used to constrain a 3D reconstruction of the shallow volcanic plumbing system (c). Prepared by the author for this chapter.

Besides serving as photo-realistic 3D basemaps for the analysis of geologic structures, digital outcrop models and digital elevation models derived from UAV-based imagery are used for the topographic correction of hyperspectral imagery. Furthermore, the digital outcrop models can be fused with hyperspectral data and their derivatives (e.g., Lorenz et al., 2018; Kirsch et al., 2018, 2019) for improved interpretability of material properties and delineation of lithologic contacts.


4.1.2 Geological analysis using UAV-based hyperspectral imaging

Each pixel of a hyperspectral image (HSI) contains a continuous spectrum over a certain wavelength range. The spectrum is material specific and thus, in geological contexts, yields information on mineralogical composition. Current UAV-borne hyperspectral sensors cover the VNIR (0.4–1.0 μm) and SWIR (1.0–2.5 μm) range of the electromagnetic spectrum, in which electronic processes and molecular vibrations cause characteristic absorption features for a variety of common geologic materials. This includes iron oxides, iron hydroxides, and iron sulfates as well as rare earth elements in the VNIR, and “alteration minerals”, such as phyllosilicates, hydroxylated silicates, sulphates, carbonates, and ammonium minerals in the SWIR (e.g., Hunt, 1977; Pontual et al., 1997). These minerals can be identified and characterised in HSI using band ratios, minimum wavelength mapping, dimensionality reduction, mineral mapping/unmixing, and unsupervised or supervised classification (e.g., Contreras Acosta et al., 2019).
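A band ratio is the simplest of these analysis techniques and can be sketched in a few lines; which numerator and denominator wavelengths are diagnostic depends on the target mineral and the sensor, so the values passed to the hypothetical function below are the analyst's choice.

```python
import numpy as np

def band_ratio(cube, wavelengths, num_wl, den_wl):
    """Compute a simple band ratio from a hyperspectral cube (rows, cols, bands).

    `num_wl` and `den_wl` are the wavelengths (same unit as `wavelengths`) of
    the numerator and denominator bands; the nearest available bands are used.
    The diagnostic wavelengths themselves must be chosen by the analyst."""
    i_num = int(np.argmin(np.abs(wavelengths - num_wl)))
    i_den = int(np.argmin(np.abs(wavelengths - den_wl)))
    den = cube[..., i_den]
    return np.divide(cube[..., i_num], den,
                     out=np.zeros_like(den, dtype=float), where=den != 0)
```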

Acquisition routines for UAV-based hyperspectral imaging differ between frame-based and push-broom cameras. Frame-based systems can be operated in a nadir or off-nadir setup (because corrections are accomplished through co-registration of bands within individual data cubes and subsequent georeferencing to an orthophoto). As images are acquired band-wise, it is recommended to fly at low speed (the exact speed depends on flight height) or in a stop-and-go mode to maximize the spatial overlap between bands. Push-broom sensors are best operated in automatic mode to minimize deviations from a nadir viewing angle. IMU data are essential to allow correction for roll, pitch and yaw of the drone in post-processing. Calibration panels with known spectral signatures should be placed in at least one of the scenes for radiometric correction of the hyperspectral data (Figure 4.1-2B).

Drone-borne hyperspectral imaging requires a specific sequence of pre-processing steps to transform the raw data into meaningful spectroscopic information, including lens correction and band co-registration, conversion to radiance, orthorectification and georeferencing, and topographic correction. Whereas the conversion to radiance can usually be accomplished using proprietary software of the camera manufacturer, the other pre-processing steps can be realized with the Mineral Exploration Python Hyperspectral Toolbox (MEPHySTo, Jakob et al., 2017). Orthorectification, georeferencing and topographic corrections require a corresponding digital elevation model or digital outcrop model of the imaged scene, which is usually acquired separately, but can also be obtained from the hyperspectral images themselves if the overlap is sufficient.

Multi- and hyperspectral imaging for geologic mapping requires an unobstructed view of the outcrops. Vegetation-free outcrops are more common in arid or arctic environments, which are challenging for the operation of drones due to extreme weather conditions and their remoteness. In more vegetated parts of the world, rocks are usually only exposed in sub-vertical outcrops (or underground), either in natural settings such as canyons or river valleys, or in artificial settings like road construction sites or open-pit mines. These circumstances often require a non-nadir sensor setup, which adds substantial complexity to acquisition and data processing routines. Hyperspectral imaging in the VNIR–SWIR region is a passive technique that requires an external light source, which, in outdoor settings, corresponds to reflected sunlight. As the light-scattering effect of moisture and dust can have a detrimental effect on the quality of spectroscopic data, hyperspectral surveys are best conducted in dry conditions and bright daylight, which can be a limiting factor in high-latitude areas, in sites of high topography and in active mining environments.

Figure 4.1-2: UAV-based hyperspectral imaging of a gossanous ridge in the Rio Tinto area, southern Spain. (A) Rikola hyperspectral VNIR camera and (B) calibration panels. (C) Vegetation-masked false colour image (Principal Components 2, 3 and 5) draped on a 3D orthophoto model, (D) image spectra (figures C and D modified after Jakob et al., 2017. Originally published under a CC BY license (https://creativecommons.org/licenses/by/4.0/)).

Validation or ground truthing is an essential step in geologic remote sensing, as it allows the accuracy of the mapping results to be assessed with error metrics and, more importantly, allows the identification of real surface features such as orientations and lithologies. Likewise, ground validation helps to increase supervised classification accuracies and provides means for atmospheric correction and geolocation by global navigation satellite system positioning. Methods of acquiring ground truth data are manifold, and range from traditional surface photography and specimen sampling with ensuing lab analysis to the use of modern, portable analytical devices such as X-ray fluorescence (XRF) spectrometers, VNIR–SWIR spectroradiometers, Fourier transform infrared (FTIR) analysers, and handheld laser-induced breakdown spectroscopy (LIBS) instruments. Further validation sources include satellite imagery and spectral libraries.

Hyperspectral data can be combined with SfM point clouds to produce HSI-enhanced digital outcrop models (Lorenz et al., 2018; Kirsch et al., 2018, 2019), which provide a three-dimensional, distortion-free framework for intuitive geological outcrop visualisation and analysis. Within a 3D framework, these models can be used to delineate geologic contacts and structures as well as to incorporate spatially referenced analytical validation data during interpretation.

4.1.3 Geological analysis using UAV magnetics

Airborne and ground-based magnetic surveys are widely used in mineral exploration, particularly in situations where there is limited outcrop. Magnetic data can be used directly to detect magnetically anomalous mineral deposits (Figure 4.1-3), and indirectly to identify geologically favourable sites for potential mineralization. Magnetic data are interpreted in conjunction with geological data to establish a link between anomalies and their source location, depth and geometry (e.g., Isles & Rankin, 2013). Recent studies (Naude & Kumar, 2017; Jackisch et al., 2019) have demonstrated that UAV-based magnetic surveys can provide high-quality magnetic data at a lower cost than labour-intensive ground surveys, and with a minimal environmental footprint. Whereas traditional aeromagnetic surveys are useful for regional reconnaissance mapping, low-altitude and dense UAV-based magnetic surveys are well suited for more detailed targeting as they have the ability to resolve small, shallow anomalies.

For the acquisition of UAV-based magnetic data, it is recommended to follow the guidelines for aeromagnetic surveys given by Reid (1980) and Coyle et al. (2014). Because the response of a magnetic body falls off with the inverse cube of the distance, UAV magnetic surveys should be flown at low altitude to maximize resolution. Survey lines should be oriented perpendicular to the strike of the geological target to enhance the detection of geological contacts, and the line spacing should be chosen to resolve the smallest features of interest. Tie lines should be flown to enable tie-line levelling corrections as a means to eliminate line-to-line errors. All magnetic surveys also require a ground-based magnetometer located in a magnetically quiet zone near the survey operation to measure and correct for diurnal variations of the magnetic field. Furthermore, a compensation test or calibration flight (Figure 4.1-3) is advised, so that post-processing adjustments can be made to account for the directional variations of a given magnetometer.
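The diurnal correction mentioned above essentially interpolates the base-station record to the rover timestamps and removes the variation about the base-station mean. A minimal sketch (illustrative names, assuming both instruments record total magnetic intensity on a common time basis):

```python
import numpy as np

def diurnal_correction(rover_time, rover_tmi, base_time, base_tmi):
    """Remove diurnal variation from UAV magnetometer readings using a ground
    base station in a magnetically quiet area: the base-station signal is
    interpolated to the rover timestamps and its variation about its own mean
    is subtracted from the rover readings."""
    base_at_rover = np.interp(rover_time, base_time, base_tmi)
    return rover_tmi - (base_at_rover - base_tmi.mean())
```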


Electromagnetic interference produced by UAVs can compromise magnetic data quality. In order to avoid electromagnetic noise from the engines and payload electronics, the sensor should be placed at a distance from these sources, either in the tail end of a fixed wing drone (Figure 4.1-3B) or attached/towed underneath a multirotor UAV (Figure 4.1-3A). To prevent sudden changes of current to the engines, flights are best conducted in windless conditions, at constant barometric height, and in automatic flight mode. A gimbal can reduce artifacts in the measured magnetic field due to attitude variations of the UAV. Sources of cultural noise (e.g., power lines, railways, electric fences, radio towers, etc.) should be avoided.

Figure 4.1-3: Results from a UAV-based magnetic survey at Otanmäki, Finland (figures modified after Jackisch et al., 2019. Originally published under a CC BY license (https://creativecommons.org/ licenses/by/4.0/)). (A) Multicopter with SenSys MagDrone R1 fluxgate magnetometer. (B) Fixed- wing drone from Radai Oy. (C–E) Total magnetic intensity plots with survey lines and (F) geological map (black line delineates outcrop). Note differences in resolution of magnetic data captured at different elevations and good correlation between high magnetic intensities and mapped iron ore.


Processing of UAV magnetic data involves the removal of heading errors, diurnal variations and interference by the magnetic field of the UAV, compensation for the sensor movement, and tie-line levelling. A commercial toolbox, e.g. the Seequent UAV Geophysics Extension for Oasis Montaj, optimized for specific types of sensors, is able to handle most of these operations. Regional-residual separation is a crucial step in the interpretation of magnetic data to constrain the distribution of the magnetic response. Since the total field is the result of all sources below the sensor, this analysis allows a differentiation between deeper and shallower anomalies, based on the fact that the regional field spectrum is dominated by low frequencies that come from larger and deeper sources, while the residual field is dominated by high frequencies that come from small and shallow sources.
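One simple way to approximate regional-residual separation on a gridded TMI map is a low-pass filter, with the residual obtained as the difference to the original grid. The sketch below uses a Gaussian filter for brevity; upward continuation or spectral filters are common alternatives, and the cut-off parameter must be tuned to the survey.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def regional_residual(tmi_grid, sigma_cells=20):
    """Split a gridded total-magnetic-intensity map into a smooth regional field
    (low frequencies, deeper/larger sources) and a residual field (high
    frequencies, shallow/small sources) using a Gaussian low-pass filter."""
    regional = gaussian_filter(tmi_grid, sigma=sigma_cells)
    residual = tmi_grid - regional
    return regional, residual
```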

Magnetic data are usually presented as maps of total magnetic intensity (TMI). Common enhancements for the interpretation of magnetic data include (a) the reduction-to-pole (RTP) transform, which removes the asymmetry of magnetic anomalies where the Earth's magnetic field is non-vertical, (b) vertical and horizontal derivatives that highlight discontinuities, and (c) the analytic signal, in which all three directional gradients of the magnetic field are combined to help delineate geological bodies and resolve closely spaced bodies. TMI maps and derived datasets can then be interpreted to define geophysical domains, which can be correlated with other geophysical datasets (e.g. gravity) and geological constraints (e.g., Isles & Rankin, 2013). If the magnetic properties of rock units are known, then forward and inverse modelling techniques can be applied to gain further insight into the three-dimensional geological structure and to test specific geometrical hypotheses.

4.1.4 Outlook and conclusions

UAVs have become an important, if not essential, tool in the study of geological targets. Remotely sensed datasets provide indispensable information on the topography, structure, and main mineral composition of an area. This has enhanced the way conventional geological work is performed. By using multiple UAV sensors, geologists are able to map inaccessible areas, improve existing geological maps, and acquire valuable geological data rapidly and safely. Currently, academic efforts are focussed on efficient ways to combine multi-sensor data. There is an urgent need for innovative data processing methodologies (Artificial Intelligence [AI], Machine Learning [ML]) for exploiting the data acquired by UAV platforms at multiple spatial and temporal scales. These approaches will build the foundation of future predictive tools.


References for further reading


4.2 UAVs in geomorphology

Dirk Hoffmeister, Andreas Kaiser and Anette Eltner

4.2.1 Fluvial geomorphology
4.2.1.1 Bathymetry
4.2.1.2 Granulometry
4.2.1.3 Change detection
4.2.1.4 River habitats
4.2.2 Soil erosion
4.2.3 Gravitational processes
4.2.4 Tectonics
4.2.5 Marine and coastal applications

Geomorphology in general is the scientific area that examines the origin and evolution of the Earth surface and its landforms. It focuses on the landscape-forming processes and their magnitudes. In this context, UAVs generally enable the detection and documentation of features. Most importantly, processes of change can be monitored much more easily and with unprecedented spatial and temporal resolution. Therefore, UAVs allow for a paradigm change in geomorphologic measurements. The field of geomorphology is closely related to geomorphometry and often uses the analysis of DEMs (chapter 3.4). Likewise, geology (chapter 4.1), hydrology (chapter 4.3), as well as processes in the cryosphere (chapter 4.5) and volcanology (chapter 4.6) are parts of or closely related to this scientific area. Thus, in this chapter the major fields of UAV applications in a geomorphologic context are described (see also Figure 4.2-1), namely fluvial geomorphology, soil erosion, gravitational processes, tectonics, and coastal to marine applications.


Figure 4.2-1: Illustration (by Melanie Elias) of the UAV applications to observe geomorphological processes described in this chapter. (A) Fluvial morphology (chapter 4.2.1) considering A1 bathymetry and granulometry, A2 change detection and A3 river habitats. (B) Erosion (chapter 4.2.2). (C) Gravitational processes (chapter 4.2.3). (D) Marine and coastal applications (chapter 4.2.5). Prepared by the authors for this chapter.

4.2.1 Fluvial geomorphology

The repeated observation of rivers is important to assess the frequency and magnitude of flood events and to measure morphological controls on the impact of events. Furthermore, monitoring is needed for the anthropogenic management of rivers, e.g. to control discharge, to evaluate the impact of channel changes and to measure the quality of aquatic ecosystems. The resulting observation data are also used in numerical models, for instance, to predict future flooding areas.

Airborne LiDAR and traditional photogrammetry from airplanes are used to reconstruct the fluvial topography and bathymetry for geomorphological process understanding (Lane, 2000; Lane et al., 2003; Legleiter, 2012). However, these methods are expensive and less flexible, for instance when the river has to be mapped immediately after a flood. On the ground, total stations are applied to acquire river cross-sections, or terrestrial laser scanners are used for high-resolution topography data of a finite area (Baewert et al., 2014; Brasington et al., 2012). The former approach only allows for very low spatial resolutions and the latter is only suitable for smaller extents. Satellites are used to measure rivers as well, but are less suitable for intermittent scales due to their coarser spatial resolution (Spence & Mengistu, 2016).

UAVs enable very frequent measurement of rivers at unprecedented scales (Carrivick & Smith, 2019). First applications of UAVs in fluvial geomorphology were the reconstruction of the topography of gravel bars and the bathymetry of rivers in France using paragliders (Lejot et al., 2007) and the automatic mapping of river corridors, allowing the UAV to autonomously detect and track rivers (Rathinam et al., 2007). Overall, the main areas of application are bathymetry, granulometry, change detection and river habitat assessment, which are discussed in more detail below.

4.2.1.1 Bathymetry

Bathymetry describes the topography beneath the water surface. Bathymetric mapping can be performed with different approaches. The two most often applied techniques work either empirically, linking the attenuation of the radiation signal in the images with water depth, or geometrically, modelling and correcting the refraction effect and reconstructing the underwater area with SfM (chapter 2.2). The full 3D reconstruction of a river is possible from UAV data, including bathymetry and topography as well as flow velocity measurements (chapter 3.3), enabling even discharge estimation and therefore allowing for comprehensive hydromorphological monitoring (Cândido et al., 2020; Detert et al., 2017).

The acquisition of measurements below the water surface requires the compensation of refraction, either with a simple correction factor suitable for applications with nadir-viewing cameras (Woodget et al., 2015) or by considering each camera perspective individually, which makes the approach also suitable for off-nadir imagery (Dietrich, 2017; Mulsow et al., 2018). The refraction correction approach relies on the visibility of the submerged areas and on calm water conditions. Furthermore, accurate information about the water level is needed (Woodget et al., 2019). However, techniques exist to also estimate the position of the water level as another unknown parameter within a bundle adjustment procedure to reconstruct the bathymetry (Mulsow et al., 2018). The error of the underwater reconstruction increases with increasing water depth (Woodget et al., 2015), and turbulence can hinder reconstruction completely because of complex refractions (Entwistle & Heritage, 2019). Challenges still to overcome for this method are the impact of water depth, water turbidity and water colour, which reduce image texture and thus the success of the SfM reconstruction (Kasvi et al., 2019).
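The simple correction factor mentioned above scales the apparent (SfM-derived) depth by the refractive index of clear water, approximately 1.34 (Woodget et al., 2015). A minimal sketch assuming nadir imagery and a known water-surface elevation:

```python
def correct_refraction(water_surface_elev, apparent_bed_elev, n_water=1.34):
    """Simple refraction correction for nadir imagery: apparent depths from the
    SfM reconstruction are multiplied by the refractive index of water.
    Elevations are in metres; NumPy arrays work as well as scalars."""
    apparent_depth = water_surface_elev - apparent_bed_elev
    true_depth = n_water * apparent_depth
    return water_surface_elev - true_depth   # corrected bed elevation
```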

The empirical approach relates the water depths measured in the field to pixel intensity values in the image (Flener et al., 2013; Lejot et al., 2007; Tamminga et al., 2015). However, a limitation of this regression approach is the need for reference measurements. Furthermore, the river bed structure should be smooth and river flow conditions have to be calm for correct results.

4.2.1.2 Granulometry

The measurement of grain sizes and their distribution, i.e. granulometry, is needed to assess flow conditions, e.g. turbulence and velocities, or to evaluate habitat conditions. Traditional granulometry approaches are either measurements performed in the field, which are very labour-intensive and commonly show high variability in the results, or photosieving, where grain sizes are measured in usually terrestrially acquired images either manually (Ibbeken & Schleyer, 1986) or automatically (Detert et al., 2017). UAVs enable the calculation of granulometry for entire river reaches due to the high-resolution imagery and thus significantly expand the area of investigation compared to the traditional local sampling approaches.

To identify and classify grain sizes with UAV data, on the one hand, photosieving approaches are extended to the aerial imagery. This enables, for instance, the detection of multi-temporal changes of the grain size distribution across point bar transects and therefore the quantification of change of the structure of an alluvial accumulation form due to flood events (Langhammer et al., 2017). Photosieving with UAV imagery has been improved to measure grain sizes in directly georeferenced imagery (chapter 4.2) without the need for any ground control or empirical data (Carbonneau et al., 2018), making the approach very useful for frequent mapping of large river reaches.

On the other hand, statistical relationships are established between image texture or topographic characteristics and average grain sizes (Carbonneau et al., 2018; Woodget & Austrums, 2017). Image-based approaches use texture measures such as the GLCM (chapter 3.2), and topography-based methods usually consider roughness estimates, e.g. correlating the standard deviation of heights with grain sizes (Vázquez-Tarrío et al., 2017). Choosing one of the two approaches depends on the applied scale and the sediment characteristics. In a case study by Woodget and Austrums (2017) for a reach smaller than 1 km and data with cm-resolution, roughness was the better predictor. However, in a subsequent study, grain sizes were smaller and imbricated, so the same parameter did not perform as well because the grain size was not as well represented by the topography (Woodget et al., 2018). Considering additional information such as the patch facies can improve the estimation with roughness, but challenges still remain for less sorted sediment (Pearson et al., 2017).
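A minimal sketch of the roughness-based approach: local roughness is computed as the standard deviation of heights in a moving window (detrending the surface beforehand is advisable in practice), and a linear relation to field-measured grain sizes is calibrated. The window size and the regression form are survey-specific assumptions, not values from the cited studies.

```python
import numpy as np
from scipy.ndimage import generic_filter

def roughness_map(dem, window=7):
    """Local roughness as the standard deviation of heights in a moving window
    (window size in grid cells is a tuning parameter)."""
    return generic_filter(dem, np.std, size=window)

def calibrate_grain_size(roughness_samples, measured_d50):
    """Least-squares fit of a linear relation d50 = a * roughness + b against
    field samples; returns the coefficients (a, b)."""
    a, b = np.polyfit(roughness_samples, measured_d50, deg=1)
    return a, b
```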

In flat terrain, image texture can be better suited because texture relies on grain edges; thus, grains might be well distinguished in 2D images but not necessarily in the 3D data. If image texture is used to predict grain sizes, images without blur are a prerequisite (Woodget et al., 2018). Furthermore, image analysis should be performed on the original single images rather than the orthomosaic, as the calculation of mosaics leads to distortion of the original image content (Woodget et al., 2017), due to the interpolation of orthomosaic pixel values from overlapping images and the propagation of errors from the estimated image geometry and dense point cloud into the final map.

4.2.1.3 Change detection

Change detection of fluvial environments is needed to assess the geomorphic impact of floods and the change of channels due to changing environmental conditions, and to monitor the success of river restoration. UAV data can be used to calculate sediment budgets, roughness changes and channel pattern alterations, or to perform connectivity analyses.
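The underlying DEM-of-Difference (DoD) workflow (chapter 3.4) can be sketched as follows: two gridded DEMs are differenced, changes below a minimum level of detection are masked, and the remaining cells are converted into erosion and deposition volumes. The LoD and cell size used here are placeholders.

```python
import numpy as np

def dod_volumes(dem_t1, dem_t2, cell_size=0.05, lod=0.03):
    """DEM of Difference with a minimum level of detection (LoD).

    Cells whose elevation change is smaller than the LoD (metres) are treated
    as noise; the rest is converted to erosion and deposition volumes using
    the grid cell size (metres)."""
    dod = dem_t2 - dem_t1
    dod = np.where(np.abs(dod) < lod, np.nan, dod)
    cell_area = cell_size ** 2
    deposition = np.nansum(np.where(dod > 0, dod, 0)) * cell_area
    erosion = -np.nansum(np.where(dod < 0, dod, 0)) * cell_area
    return dod, erosion, deposition
```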

The evolution of point bars below and above the water surface can be observed to derive continuous wet-dry models (Flener et al., 2013). Furthermore, lateral bank shifts and river incision due to erosion during flood events, the corresponding eroded volumes and changing channel patterns, such as river width and height variability, can be measured (Marteau et al., 2017; Mirijovskỳ & Langhammer, 2015; Tamminga et al., 2015). Another novel possibility is the immediate observation of the impact of extreme events. The topographic and bathymetric data of pre- and post-flood events can be used in hydrodynamic models to estimate peak discharges and consequently assess how well these models can describe the actual processes. Thereby, Tamminga et al. (2015) could reveal that the impact of these extreme flood events is still not well understood and that changes of the morphology of the channel regime were mostly unrelated to pre-flood conditions. Furthermore, the propagation of knick-points, which again influence the spatial patterns of changes along the river, can be detected due to frequent data acquisition of river reaches (Marteau et al., 2017).

Automation of the detection of fluvial features (e.g. ripples, deep and shallow areas, side bars, river banks, gravel, sand and vegetation) becomes necessary to assess the changes of these features over time and to cover larger areas. On the one hand, 2D information can be used, considering only the images in combination with machine learning techniques (chapter 3.2) such as artificial neural networks (Casado et al., 2015), random forest classifiers (Feng et al., 2015) or supervised classifications (Flynn & Chapra, 2014). Feature-based mapping can be extended from 2D to 3D information, as performed by Langhammer and Vacková (2018), who detected fresh and old gravel, sand accumulations and bank erosion and could identify the flood-affected area in relation to water depth.


4.2.1.4 River habitats

Due to the high spatio-temporal resolution of UAV data, it is now possible to measure continuums instead of single samples to evaluate river habitats, at least supporting well-established classification approaches (Woodget et al., 2017). UAVs can even modernise surface flow type mapping because they provide a continuous and quantitative mapping of rivers at the microscale while covering mesoscale areas, and thus can be advantageous over less reliable and more error-prone traditional approaches because drones better capture the high spatio-temporal variability of river habitats (Woodget et al., 2016). To enable the future frequent application of UAVs in fluvial geomorphology, for instance to implement the data in updated hydrodynamic modelling, direct referencing will be needed to capture river reaches at the km-scale (Hamshaw et al., 2017).

4.2.2 Soil erosion

Multi-temporal SfM (see DoD, chapter 3.4) based on images recorded from UAV platforms allows soil surface changes to be measured at different scales with very high accuracy. Accuracies at the sub-mm level at low flying heights (below 10 m) and spatial resolutions in the sub-cm range are possible. However, to use UAV-based photogrammetry for soil erosion measurement, a very precise ground control point setup is required for multi-temporal data acquisition and sub-cm change detection (i.e. using stable reference points around the area of interest measured with mm-accuracy).

Although UAVs are improving the assessment of soil surface changes, measurements remain most challenging at the smallest scale, i.e. interrill or diffuse erosion, because accuracy and resolution requirements are very high (Pineux et al., 2017). Nevertheless, interrill and rill erosion on hillslopes can be quantified, as shown in a fragile loess landscape (Eltner et al., 2015). By comparing both erosion forms, it was revealed that as soon as rills were forming, interrill erosion decreased significantly. When rill erosion is assessed with UAVs, it becomes necessary to automatically extract these forms to effectively quantify their eroded volume (Bazzoffi, 2015; Carollo et al., 2015).

Gullies, as a geomorphologic feature of erosion in the landscape, were documented early on by analogue cameras on kites and blimps. For instance, d'Oleire-Oltmanns et al. (2012) published results from a gully documentation from 1995 to 1998 with an analogue camera. However, early fixed-wing systems already outpaced classic approaches (d'Oleire-Oltmanns et al., 2012). As undercuts and steep sidewalls occur in particular in gully morphology, images taken close to nadir hinder a full 3D surface reconstruction. Thus, images from a bird's-eye perspective or a combination of aerial and terrestrial images can be beneficial with regard to precise volume calculation and hydrological analysis (Stöcker et al., 2015).

Badlands, as landscapes with a variety of erosional features, such as gullies, pipes, steep slopes and a dense drainage network, can be monitored by UAVs. In a detailed multi-temporal analysis in the Val d'Orcia, Italy, an area with differences in vegetation cover, slope and aspect was mapped, and different types of morphological changes across the catchment were revealed with sub-decimetre precision (Neugirg et al., 2016).

Besides height change measurements, high-resolution UAV data also enable the derivation of surface roughness, which is an important parameter for runoff formation and velocity (Eltner et al., 2018). Thereby, isotropic as well as anisotropic roughness can be used to highlight, for instance, the importance of the connectivity of depressions across and along the slope.

The unprecedented spatial resolution and accuracy of UAV photogrammetry also allow other processes causing soil surface changes to be detected and measured. Compaction, consolidation, swelling, shrinkage and hydrostatic impacts can completely mask the erosion signal. Kaiser et al. (2018) highlight that for a robust and accurate soil erosion measurement, further investigation (i.e. of clay mineralogy) of these processes is needed in the future to potentially correct their influence in the surface change model.

4.2.3 Gravitational processes

Geomorphological analysis of gravitational mass movements needs to incorporate neighbouring disciplines such as geology, hydrology and soil science to acquire a holistic image of a certain area of interest. In various cases, the analysis is additionally supported by an anthropo-geographical perspective, as in densely populated and geomorphologically active regions, human settlements and ground movements may lead to conflicts. When planning a UAV surveying campaign, this multi-disciplinary nature should be considered to produce data of interest for all potential users.

UAVs are beneficial to researchers as areas of active slope failures, rockfall or slow earth flow are inherently difficult or dangerous to access. Early experiences with UAV applications in landslide documentation date back a decade, when Rau et al. (2011) presented mapping results after the 2009 Typhoon Morakot triggered nationwide landslides. Different from recent applications, they flew at 1,400 m above ground to capture an area of 21.3 km². A GSD of 17 cm was sufficient to document damages. Repeated campaigns allow for change quantification (see DoD, chapter 3.4) at high spatial and the desired temporal resolution. Clapuyt et al. (2016) monitored a ~17 ha landslide in the northern foothills of the Swiss Alps at an annual interval at a flying altitude of 60 m. This relatively low repetition rate was considered suitable with regard to the in-situ dynamics. However, they explicitly pointed out that different natural hazards may need increased frequencies, for which UAVs are an appropriate tool. Turner et al. (2015) mapped a 7,500 m² landslide at seven dates over four years and were able to separate surface areas according to their respective dynamics.

In surveying gravitational mass movements, a combination of platforms and sensors can be fruitful. Casagli et al. (2017) present various applications from satellite Earth observation to ground-based systems, with UAVs in between. The authors underline that the choice of temporal and spatial resolution is case-dependent and can differ in all phases of an event. While satellite imagery is sufficient for post-disaster damage assessment, landslide inventories and mapping at, e.g., basin scale, UAVs are recommended for periodic checks of detailed movements, volume measurements and rapid assessments at slope scale.

4.2.4 Tectonics

The detection and monitoring of existing tectonic surface features allows for insights into tectonic processes, and photogrammetric approaches from airborne data help by improving data density, especially in hardly accessible terrain. Deformations of the surrounding bedrock, the orientation and shape, and also the texture of dykes allow the reconstruction of formation conditions and thus reflect ancient events. Orthoimagery and point cloud analysis allow for detailed and precise mapping of such features and are therefore well received in geological surveys. The advantage of high-density data acquisition goes along with a challenging analysis and interpretation due to the large amount of produced data. Semi-automatic handling of orthoimages, DEMs and point clouds is therefore a requirement for the post-processing of such surveys. Vasuki et al. (2014) present an approach to map geological structures from high-resolution data, either fully automatically or semi-automatically. The latter reduces the mapping duration from roughly seven hours in a classic manual method to 10 minutes. They test and make use of different feature detection approaches and segment linking in a MATLAB workflow to produce a structure map, including faults and dip directions, either unguided or controlled through user inputs.

Dering et al. (2019) compile best-practice instructions for the mapping of dykes on unvegetated surfaces, ranging from the choice of UAV, depending on the area to be covered and the detail needed in the produced data, to data analysis, interpretation and examples. High-end analysis can be achieved by surface brightness gradients and colour contrast, automatic and semi-automatic fracture and lithology mapping, e.g. by freely available applications such as the CloudCompare plugins Compass and Facets or QGIS' GeoTrace, and sophisticated 3D structural analysis techniques such as LIME (Buckley et al., 2017) or OpenPlot (Tavani et al., 2011). Both 2.5D raster and orthomosaic data in GIS-based analysis and 3D point cloud analysis offer high potential and, depending on the application, show their advantages over other approaches. The authors also point out that for the monitoring of tectonic or dyke-related displacements and the respective change detection approaches, fixed ground control points should be avoided, and they rather recommend UAVs with differential GPS capabilities. The study of intrusive systems may also benefit from UAV-based sensors outside of the visible range, i.e. thermal infrared, hyperspectral, aeromagnetic and possibly gravitational measurements, but there is a need for research in this domain.

A case study by Fazio et al. (2019) underlines the usability of UAVs in geological mapping in an experimental design that combines in-situ mapping with drones. On a hardly accessible cliff wall, they map geological features and point out the benefits of small airborne systems over terrestrial LiDAR or terrestrial photogrammetry, boat-based mobile laser scanners and total stations. Their results point at tectonic conditions during bedrock formation and reveal differences in aperture width, indicating different ages of genesis.

Another structural mapping approach is given by Vollgger and Cruden (2016), who produced spatial datasets at sub-cm resolution for South-Eastern Australian basement and cover rocks. They offer detailed joint orientation histograms and produce a 3D structural trend model derived from the dense photogrammetric point cloud. Detailed bedding trend surfaces could thus be derived and visualised, and the wavelengths of anticlines and synclines could be reconstructed.

Tectonic control on geyser activity could be shown in a study by Walter et al. (2020), who combined an optical and a TIR UAV and underwater cameras in an Icelandic geothermal field. They recommended night-time data acquisition for the thermal imagery to reduce solar interference with heated surfaces around the area of interest. The produced thermal anomaly map enabled the authors to count and locate hot spots and link them to the seismicity of analogous orientation across the area. Supplemented by the underwater camera, the UAV survey proved the fracture-controlled nature of the geyser by recording cross-sections of the conduit.

4.2.5 Marine and coastal applications

In particular, for highly dynamic coastal areas, monitoring by UAVs instead of GPS-based transects or any other surveying method is easier and faster. The detected changes resulting from multi-temporal surveys enable the quantification of these dynamics, e.g. for sandy beaches (Casella et al., 2016) and cliffs (Ružić et al., 2014). Dune development can be related to stabilising vegetation coverage derived from the orthophotos (De Giglio et al., 2017; Hugenholtz et al., 2013; Nolet et al., 2018). The morphodynamics of foredunes at seasonal time-scales was studied by Taddia et al. (2019) and revealed an overall positive evolution of the system within two years, as well as erosion occurring in interdunal depressions and at the upper backshore. Furthermore, different magnitudes of changes of various forms were distinguished. Most changes occurred at the youngest embryo dunes, dunes at further distances from the shore migrated seaward and the back dunes remained stable. In another study, inland migration and average vertical accretion of foredunes as well as lowering of the beach during the winter season due to high-energy waves were observed (Laporte-Fauret et al., 2019).

Single boulders to larger fields of boulders dislocated by storm or tsunami events are an additional target for monitoring approaches, as SfM allows the reconstruction of the dimensions of boulders, as well as the detection of changes of moving boulders or their surrounding area. These (3D) boulder dimensions are also used in hydrodynamic equations in order to estimate the wave heights or velocities necessary for the dislocation process (Autret et al., 2018; Hoffmeister et al., 2020).

However, the acquisition by UAVs can only be carried out under calmer wind conditions. Corrosion by marine spray can cause severe damage, and permits are important. Depending on the presence of visitors, other survey options, e.g. terrestrial laser scanning (Hoffmeister et al., 2020) or kites (Autret et al., 2018), might be necessary. Likewise, the application of image matching in textureless sandy beach environments can be challenging, and other sensors, i.e. LiDAR, might be preferable. Solazzo et al. (2018) observed a significantly higher point density with ULS (UAV-based laser scanning) compared to the image-based 3D reconstruction approach. In addition, ULS can partly penetrate vegetation cover and therefore allows for a more accurate estimation of dune growth below sediment-catching plant patches. For monitoring approaches, fixed surveying points or any other constant targets are important, usually surveyed by real-time kinematic (RTK) measurements or total stations. All of these studies use a raster-based DoD approach or apply a comparison of point clouds from different time steps (chapter 3.4 and chapter 3.5). The results of these monitoring approaches can be compared to modelled or measured wave heights or inundation depths and give insights into storm impact and recovery (Turner et al., 2016).

Besides these monitoring applications, nearshore bathymetry can be extracted indirectly by analysing the speed of wave crest lines (Matsuba & Sato, 2018). The detection and tracking of marine species is possible from frames extracted from recorded videos (Colefax et al., 2018), and meadow areas have been mapped successfully by object-based image analysis (OBIA) segmentation of the orthophotos (Ventura et al., 2018). For all previous investigations, low-cost UAVs with small and simple RGB cameras were used. In contrast, hyperspectral images were successfully applied in a more complex approach by Parsons et al. (2018) for the mapping of coral bleaching. Likewise, multispectral and thermal imagery shows an enhanced potential for wildlife detection (Colefax et al., 2018). However, for all surveys, the clearness of the water (normally measured in Secchi depths) is hampered by the turbidity of the water, effects of sun glare, as well as wave heights and shoaling. Typically, surveys should be conducted close to midday, with mostly calmer water and less shadow.
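The wave-crest approach mentioned above exploits the linear (Airy) wave dispersion relation c² = (g/k)·tanh(kh), which links phase speed, wavenumber and depth; given the celerity and period of tracked crests, the relation can be inverted directly. A minimal sketch of this inversion, not the detailed method of Matsuba & Sato (2018):

```python
import numpy as np

def depth_from_celerity(c, period, g=9.81):
    """Invert the linear-wave dispersion relation c^2 = (g/k) * tanh(k*h) for
    depth h, given the phase speed c (m/s) and wave period (s) estimated from
    tracked wave crests. Only valid where c is below the deep-water limit
    g*T/(2*pi)."""
    omega = 2 * np.pi / period
    k = omega / c                 # wavenumber from c = omega / k
    arg = c * omega / g           # equals tanh(k * h)
    if np.any(np.asarray(arg) >= 1):
        raise ValueError("celerity at or above the deep-water limit")
    return np.arctanh(arg) / k
```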


References for further reading


4.3 UAVs in hydrology

Flavia Tauro

4.3.1 Streamflow monitoring
4.3.1.1 Surface flow velocity field
4.3.1.2 Water level
4.3.2 Land surface – atmosphere interactions
4.3.2.1 Soil moisture
4.3.2.2 Evapotranspiration
4.3.2.3 Accuracy and limitations
4.3.3 Comments and recommendations

Close-range remote sensing through UASs is revolutionizing hydrological sciences by affording the observation of novel variables and by increasing the temporal and spatial resolution at which natural phenomena can be observed. Airborne-based remote sensing is bridging the gap between ground-based sensing systems and satellites: not only have UASs offered refined observations, but also large spatial coverage and flight repeatability.

UAS-based remote sensing can significantly contribute to unveiling the inherent complexity of hydrological processes. Indeed, water phenomena occur at heterogeneous spatial scales, spanning from micro-rills up to the entire catchment. Also, such processes evolve rapidly in time, and potentially continuous or frequent observations may highly advance our comprehension of the response of natural systems. Historically, hydrological sciences have been increasingly enhanced by experimental studies, which are, though, often expensive, time-consuming, and risky (if, for instance, observations during extreme flood events are considered). UAS-based remote sensing has mitigated all these criticalities by enabling the mapping of fine-scale details as well as allowing non-invasive observations. Limited costs (as compared, for instance, to satellite missions) as well as simplicity of use of UASs have contributed to the spread of such approaches to numerous research groups, organizations, and the public sector worldwide. Despite such undeniable advantages, taking UAS-based observations to the level of standard measurement systems is still a challenge.

UASs have been equipped with a multitude of sensors (RGB cameras, thermal infrared cameras, multispectral and hyperspectral cameras, etc.) to dissect diverse aspects of natural catchments. Streamflow, vegetation dynamics, soil moisture, and evapotranspiration are some of the hydrological processes and aspects whose comprehension has improved thanks to UAS remote sensing. Many of these Earth system observations intersect with diverse realms of science and can be found in other chapters of this book. Regarding hydrological applications, the innovative use of UASs has enabled considerable advances in flow monitoring and in the estimation of land surface – atmosphere energy fluxes.

4.3.1 Streamflow monitoring

The estimation of flow discharge, called streamflow in the rest of this chapter, is of paramount importance to hydrological modelling and engineering practice. Streamflow is traditionally estimated through rating curves, which are relationships experimentally established between water level and flow discharge at selected cross-sections along the stream. The development of rating curves relies on the acquisition of the bathymetry and velocity at the stream cross-section through the deployment of expensive and bulky equipment (such as, for instance, current meters or acoustic Doppler current profilers). Such experimental campaigns are expensive; furthermore, measurements are not taken in challenging conditions (during floods or in difficult-to-access environments), which may put personnel and equipment at risk. Due to technical complexities in developing rating curves, such relationships are not frequently updated and morphological changes of the bathymetry are rarely considered. Also, expensive and time-consuming campaigns have led to a gradual decrease in gauging stations in Europe since the 1990s, with small hydrological catchments (less than 500 km²) lacking hydrometric observations (Tauro et al., 2018a). In developing countries, these issues are typically exacerbated (van de Giesen et al., 2014; Feki et al., 2017).
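Rating curves are commonly expressed as the power law Q = a·(h − h0)^b, with h0 the stage of zero flow. A minimal sketch of fitting such a curve to paired gaugings with standard non-linear least squares (illustrative names, not tied to a specific gauging site):

```python
import numpy as np
from scipy.optimize import curve_fit

def fit_rating_curve(stage, discharge):
    """Fit the classical power-law rating curve Q = a * (h - h0)**b to paired
    stage/discharge gaugings and return the parameters (a, b, h0)."""
    def rating(h, a, b, h0):
        # clip avoids negative bases while the optimizer explores h0
        return a * np.clip(h - h0, 1e-6, None) ** b
    p0 = (1.0, 1.5, float(stage.min()) - 0.1)   # rough initial guess
    (a, b, h0), _ = curve_fit(rating, stage, discharge, p0=p0, maxfev=10000)
    return a, b, h0
```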

Once rating curves are determined for a selected stream cross-section, gauging stations are installed to monitor water level. Existing gauging stations mostly feature point-wise sensors, such as ultrasonic meters and radars, which afford non-contact estimation of the water level. However, such measurements are related to a single point along the cross-section and may not be representative of the actual flow dynamics occurring in the stream.


Figure 4.3-1: Sketch of the measurements currently enabled by UASs in river systems.

All images were prepared by the author for this chapter.

The use of UASs has opened new frontiers towards the observation of streamflow (Tauro et al., 2018c; Manfreda et al., 2018). Latest efforts largely encompass the use of such platforms to: i) reconstruct the stream surface flow velocity field along stream reaches of several square meters, ii) estimate water level, and iii) develop stream bathymetry models. The measurement of such parameters may revolutionize the way streamflow is currently measured. Even if UAS-based remote sensing does not (or at least, not yet) solve problems such as directly capturing 3D streamflow characteristics, this approach is highly innovative. In fact, reconstructing the stream surface flow velocity field or the water level of the free surface may directly lead to streamflow estimation without the need for traditional point-wise gauging stations. Also, distributed, rather than point-wise, measurements of the stream surface are now feasible and have the potential to shed new light on several processes, including river erosion and ecosystem dynamics. Obviously, traditional ground-based measurements are still needed to verify and improve UAS-based observations. Figure 4.3-1 displays a sketch of the latest measurements enabled in river systems by UASs. In most cases, the evaluation of these parameters relies on image acquisition and processing.


4.3.1.1 Surface flow velocity field

Digital images have been successfully adopted in fluid dynamics laboratories to noninvasively (that is, without deploying sensors and probes in the flow) visualize the flow and to quantitatively reconstruct the 2D and 3D velocity field (Adrian, 1991; Raffel et al., 2007). A similar approach has been implemented on UAS platforms retrofitted with cameras (mostly RGB and thermal) and, in some cases, with laser devices that create reference points in the field of view (Tauro et al., 2015; 2016a,b and Detert & Weitbrecht, 2015) to reconstruct the 2D surface flow velocity field in natural rivers.
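
At the core of most of these image velocimetry approaches is the cross-correlation of small interrogation windows between consecutive frames. The following is a generic, minimal sketch of that principle, not the algorithm of any specific study; it assumes two already stabilized and orthorectified grayscale frames supplied as NumPy arrays, and the frame interval, ground sampling distance and window placement are illustrative parameters.

```python
# Minimal sketch of window cross-correlation between two consecutive,
# already stabilized and orthorectified frames (2D numpy arrays).
# Sign conventions and sub-pixel refinement are omitted for brevity.
import numpy as np
from scipy.signal import correlate2d

def window_velocity(frame_a, frame_b, top, left, size, dt, gsd):
    """Estimate the surface displacement of one interrogation window.

    frame_a, frame_b : grayscale frames separated by dt seconds
    top, left, size  : window position and size in pixels
    gsd              : ground sampling distance (m/pixel)
    Returns an approximate (vx, vy) in m/s.
    """
    win_a = frame_a[top:top + size, left:left + size].astype(float)
    win_b = frame_b[top:top + size, left:left + size].astype(float)
    win_a -= win_a.mean()
    win_b -= win_b.mean()
    corr = correlate2d(win_b, win_a, mode="same")
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # displacement of the correlation peak relative to the window centre
    dy -= size // 2
    dx -= size // 2
    return dx * gsd / dt, dy * gsd / dt
```

In practice, the window is repeated over a grid covering the region of interest, and the resulting displacement field is averaged over many frame pairs to yield the surface flow velocity field.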

Data acquisition consists in flying the platform (typically a multirotor, but images recorded from fixed-wing systems may be adapted and utilized as well) in hovering mode above the region of interest (frequently, the region spans several meters along the stream and includes both stream banks) for a few minutes. The onboard camera axis can be either orthogonal or at an angle with respect to the water surface, whereby inclined cameras enable the acquisition of larger fields of view. The camera captures high-definition videos of the stream surface, where floating objects may be naturally transiting or artificially dispersed (Powers et al., 2018).

4.3.1.2 Water level

Water levels have been estimated from UASs adopting an array of diverse technologies. Miniature lidar systems have been mounted onboard UASs to estimate both water level and bathymetry. Green wavelength lidars, scanning lasers, and NIR lasers have afforded measurements at accuracies of a few centimeters (Höfle et al., 2009; Mandlburger et al., 2016; Huang et al., 2018). Such systems can typically be affected by difficulties in discriminating between the returns from the water surface and the stream bed. The level of turbidity (suspended particles) and the optical properties of natural river beds are crucial factors for the reconstruction of river topography. In some cases, river beds reflect less light than a Secchi disc, thus hindering the estimation of the water depth (Flener et al., 2013). Also, the air-water interface and the synchronization between sensors are major technical issues which appreciably influence measurement accuracy.

Radars, sonars, and custom-built camera-based laser distance sensors have been used in Bandini et al. (2017b, 2018) to measure the range of the platform to the water surface. The orthometric water level has then been retrieved by subtracting such range from the elevation given by the Global Navigation Satellite System (GNSS) receiver mounted onboard the platform. Such technologies have achieved accuracies of approximately 4 cm. UAS photogrammetry offers much lighter-weight payloads than lidars for the estimation of water level (Ridolfi & Manciola, 2018).
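
The subtraction described above can be illustrated with a minimal numerical sketch; all heights, offsets and the geoid undulation below are illustrative values, not data from the cited studies.

```python
# Minimal sketch of the water level retrieval described above: the onboard
# ranging sensor measures the distance to the water surface, which is
# subtracted from the GNSS height of the platform. The geoid undulation
# converts ellipsoidal to orthometric height; all values are illustrative.
gnss_ellipsoidal_height = 142.63   # m, UAS antenna height above the ellipsoid
range_to_water = 35.48             # m, radar/sonar/laser range to the surface
geoid_undulation = 46.20           # m, local geoid-ellipsoid separation
antenna_to_sensor_offset = 0.12    # m, vertical lever arm on the platform

water_level = (gnss_ellipsoidal_height - geoid_undulation
               - antenna_to_sensor_offset - range_to_water)
print(f"Orthometric water level: {water_level:.2f} m")
```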


4.3.2 Land surface – atmosphere interactions

Surface energy fluxes highly influence the water cycle and water resources management. In recent years, remarkable efforts have entailed the use of UASs rather than traditional instrumentation to facilitate the remote estimation of land surface – atmosphere interactions. In the following, we present latest results on soil moisture and evapotranspiration observations, see Figure 4.3-2.

Data acquisition for surface energy flux observations can be executed with both multirotor and fixed-wing platforms. Sometimes, the UAS is conveniently flown in autonomous mode and images can be automatically georeferenced using information on the UAS attitude.

Typically, UASs are equipped with multispectral sensor payloads at sub-meter resolution. The flight mission is pre-programmed with GPS-waypoint navigation. In most cases, data may exhibit diverse spatial resolutions (for instance, thermal imagery tends to have coarser resolution than RGB), and data re-sampling and interpolation are necessary. Importantly, UAS-based acquisitions frequently need ground-based calibration and are often complemented with ground sampling to estimate soil characteristics and texture, field capacity, and wilting point.

Figure 4.3-2: Sketch of UAS-based measurements of surface energy fluxes.

4.3.2.1 Soil moisture

Soil moisture is a fundamental driver of physical processes in natural ecosystems and a crucial parameter for agricultural management. It is traditionally measured with in situ sensors (gravimeters, time and frequency domain reflectometers, and neutron probes), which typically provide accurate estimates over areas of limited extension and require laborious campaigns (Andreasen et al., 2017; Bogena et al., 2015). However, the use of such probes in large-scale environments can be rather impractical and time-consuming.

Surface soil moisture (that is, moisture within the first 10 cm of the soil) can be remotely estimated through satellite-based earth observations. Optical, thermal, multispectral, and microwave remote sensing have been demonstrated to provide accurate representations of the land surface-atmosphere flux exchanges with minimum parameterization (Petropoulos et al., 2009). However, satellites also have infrequent revisit times, are affected by cloud cover, exhibit low spatial resolutions and may not overpass the entire globe (Wang et al., 2018a). UASs mitigate several of these satellite criticalities: they can be flown at lower altitudes, thus affording higher spatial resolutions; platforms can monitor difficult-to-access areas at high temporal frequencies, at low costs and in cloudy periods.

Approaches to estimate soil moisture with UASs leverage existing remote sensing methods. They include optical sensing (Filion et al., 2016; Anne et al., 2014), integrated approaches that combine optical sensing and thermal infrared observations (Carlson, 2007) and microwave remote sensing methods (Kornelsen & Coulibaly, 2013). One of the first instances of UAS platforms for surface soil moisture estimation is the AggieAir, a 14-pound (approx. 6.4 kg) fixed-wing platform that can fly for up to one hour at a speed of 30 miles per hour (approx. 48 km/h) (Jensen et al., 2009). Surface soil moisture maps can be transferred onto large-scale areas based on the relation with image-based vegetation indices. Distributed acquisitions of such indices are then used as inputs to soil moisture models. Examples of such indices include the normalized difference vegetation index (NDVI), enhanced vegetation index (EVI), and vegetation condition index (VCI), among others. Computation of the indices relies on the acquisition of high-resolution imagery from UASs in the visual spectrum, near-infrared, and infrared/thermal bands. Images in the visible bands also allow for generating digital elevation models, which can be helpful to reconstruct orthorectified mosaics.
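
As an illustration of the index computation step, the following minimal sketch computes the NDVI from co-registered red and near-infrared reflectance bands. The band arrays here are small illustrative patches; in practice they would be extracted from the orthomosaic of a UAS survey.

```python
# Minimal sketch: computing NDVI = (NIR - Red) / (NIR + Red) from two
# co-registered reflectance bands given as 2D numpy arrays.
import numpy as np

def ndvi(nir, red, eps=1e-9):
    """Normalized difference vegetation index per pixel."""
    nir = nir.astype(float)
    red = red.astype(float)
    return (nir - red) / (nir + red + eps)

# Illustrative 2x2 reflectance patches
nir = np.array([[0.45, 0.50], [0.30, 0.55]])
red = np.array([[0.08, 0.10], [0.12, 0.07]])
print(ndvi(nir, red))
```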

To develop surface soil moisture maps, optical vegetation indices and ground-based measurements serve as inputs to models. Alternatively, the thermal inertia approach relates soil moisture to the difference in maximum and minimum soil and crop canopy temperatures during the day (Idso et al., 1975). The crop water stress index is another methodology that relates the components of the surface energy balance to changes in soil moisture. Several machine learning approaches exploit vegetation indices to calculate soil water content (Hassan-Esfahani et al., 2015, 2017). In Wang et al. (2018b), soil moisture at the root-zone (down to 30 cm) is estimated by coupling a modified version of the temperature-vegetation triangle approach (from thermal, multispectral, and RGB imagery sensed with a multirotor) with information on surface roughness (related to the aerodynamic resistance to heat transfer) gathered from the structure-from-motion technique.

4.3.2.2 Evapotranspiration

Increasing global population and climate variability are challenging water resources availability and management. Improved agricultural yield as well as enhanced resilience against water shortage can be achieved through multi-sensor observations at high spatial and temporal resolution. In this vein, remote sensing missions have proved instrumental for monitoring evapotranspiration at large spatial scales. Specifically, thermal imaging and hyperspectral and multispectral measurements have been widely adopted as proxies for evapotranspiration estimation (Price, 1982; Govender et al., 2007). Unlike satellites, UASs offer much more time-refined observations at higher spatial resolution, which have the potential to significantly improve agricultural practice, such as the timing and amount of crop irrigation (Kustas et al., 2018) and the selection of genotypes that are resilient to water deficit (Ludovisi et al., 2017).

Similar to surface soil moisture, proximal sensing-based evapotranspiration estimation frequently relies on energy balance models fed with thermal and multispectral imagery. UASs simultaneously capture images at diverse bands that are processed to yield vegetation indices and to estimate the energy balance components (Ortega-Farías et al., 2016). Surface energy balance models have been developed since the 1940s based on the assumption that the rate of exchange of heat and mass between the ground and atmosphere is caused by a difference in the potential of the land surface-atmosphere system as well as by resistances due to the local land and vegetation properties (Kalma et al., 2008). One-source surface energy balance models treat energy fluxes between soil, vegetation and the atmosphere as a whole, whereby no distinction is made between evaporation from the soil surface and transpiration from the vegetation (Monteith, 1965). On the other hand, two-source models regard the evapotranspiration fluxes as the sum of the contributions from the soil surface and vegetation (Shuttleworth & Wallace, 1985). These approaches involve meteorological variables which are frequently estimated from local sparse networks of weather stations at the time of UAS surveys. Also, camera radiometric calibration with ground-based measurements is often required.
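
A common way to close such energy balance models is the residual approach, in which the latent heat flux (and hence evapotranspiration) is obtained as what remains of the available energy after the soil and sensible heat fluxes are accounted for. The following is a minimal numerical sketch of that idea under simplified assumptions; the flux values are illustrative, and in a real workflow Rn, G and H would be estimated from UAS thermal/multispectral imagery plus meteorological data.

```python
# Minimal sketch of the residual form of a surface energy balance:
# LE = Rn - G - H, then converted to an evapotranspiration rate.
LAMBDA_V = 2.45e6   # latent heat of vaporization (J/kg), ~20 deg C
RHO_W = 1000.0      # density of water (kg/m^3)

Rn = 520.0   # net radiation (W/m^2), illustrative
G = 60.0     # soil heat flux (W/m^2), illustrative
H = 180.0    # sensible heat flux (W/m^2), illustrative

LE = Rn - G - H                                        # latent heat flux (W/m^2)
et_mm_per_hour = LE / (LAMBDA_V * RHO_W) * 1000.0 * 3600.0
print(f"LE = {LE:.0f} W/m^2  ->  ET ~ {et_mm_per_hour:.2f} mm/h")
```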

Evapotranspiration has been estimated with two-source energy balance models in agricultural crops at very high resolution (Hoffmann et al., 2016). To this end, thermal images have been captured from fixed-wing or multirotor UAVs, mosaicked, and input to energy balance models. Such data need to be complemented with meteorological variables typically obtained at local stations and eddy covariance towers. In Wang et al. (2019a), an image resolution consistent with the canopy size (1.5 m) is found to be sufficient to capture the spatial heterogeneity of evapotranspiration fluxes.

4.3.2.3 Accuracy and limitations

UAS-based sensing approaches afford surface soil moisture and evapotranspiration estimation at large scales, which is advantageous given the high heterogeneity of such parameters and their dependence on a multitude of factors (such as, for instance, vegetation, topography, human activities). Also, these proximal sensing methods can be employed to yield quantitative estimations at remarkably refined spatial resolutions (meter level).

A recent study has proposed a methodology to fully exploit the potential offered by UAS earth observations by temporally interpolating sparse estimates of land surface variables, such as soil moisture. This approach interpolates land surface state variables obtained from UAS-based snapshot data to upscale instantaneous to daily observations. In the future, UAS observations may be complemented with a few meteorological and remote sensing data to yield temporally continuous land surface-atmosphere flux exchanges at high spatial resolution (Wang et al., 2019b).

4.3.3 Comments and recommendations

Despite the promise demonstrated by the use of UASs in streamflow monitoring and energy balance estimations, the technology is still struggling to become a standardized procedure. Many data acquisition procedures, including planning the flight mission and setting processing parameters, heavily rely on the expertise of the user. Regarding streamflow monitoring, even if a general agreement has been achieved on the image processing workflow toward surface flow velocity extraction, no guidelines have been developed to inform the choice of stabilization approaches and velocimetry algorithms. While it is generally agreed that velocimetry algorithms may exhibit diverse performance based on the flow regime or the presence of tracers, major efforts still focus on the development and enhancement of velocity estimation tools rather than on collaborative activities toward the standardization of the procedures.

In the realm of energy flux estimation, platforms are typically fitted with numerous pieces of instrumentation (visible, thermal, and multispectral cameras), which considerably raise costs and reduce manageability. In some cases, the integration of data sources with diverse spatial and temporal resolution may involve the use of machine learning algorithms and sharpening tools that are needed to combine data of different dimensionality, thus complicating data handling and extraction. Also, ground-based measurements are still essential for running energy balance models.

Current challenges that still need to be addressed involve flying UASs in windy and rainy conditions. Inaccuracies in image stabilization can be highly detrimental to measurements and accurate data geolocalization, even if expected technological improvements to the platforms should substantially mitigate these issues in the near future.

References for further reading


4.4 UAVs in forestry

Markus Hollaus and Livia Piermattei

4.4.1 UAV images for forestry .......................................................................................................... 347
4.4.1.1 Forest structure parameters ...................................................................................... 350
4.4.1.2 Forest mapping and classification and forest health .............................................. 352
4.4.1.3 Forest biomass ............................................................................................................ 355
4.4.2 UAV LiDAR for forestry .......................................................................................................... 357
4.4.3 Synthetic Aperture Radar ........................................................................................................ 359
4.4.4 Strength, limitations and future directions of operational applications ........................... 360
4.4.5 Conclusions............................................................................................................................... 363

Unmanned Aerial Vehicle (UAV) platforms are emerging as one of the most promising remote sensing technologies to provide data for research and operational applications in a wide range of disciplines, including forestry.

This chapter provides a coherent synthesis and framework by which UAVs can be used through passive and active sensors in forest-related disciplines. The general goal is to advise foresters and forestry-oriented researchers on choosing, by means of examples, the appropriate UAV and sensor according to the application. Furthermore, we summarise suitable approaches to get reliable results and information on forest systems and their dynamics. The focus of this chapter is on the use of UAV images and UAV LiDAR data and the forest information derived from them, including the applied methods and achievable accuracies. Furthermore, a brief description of radar UAV applications in forestry is given. Finally, the challenges and technical considerations of UAVs for possible operational applications in forestry are discussed.

Forests are viewed, defined, and assessed from different perspectives (Chazdon et al., 2016). The formal definition of forest is based on the economic, social, ecological and political value of the tree-covered land. Furthermore, each country has its own legal forest definition(s), whereas the FAO provides a worldwide one (FAO/FRA, 2000). Forests can be classified according to the amount of human alteration (plantation or ‘natural’ stands), the climate, and the predominant tree species composition (i.e. broadleaf trees, coniferous or needle-leaved trees, mixed and tropical forest). In general, forests are composed of layers such as forest floor, shrub layer, understory, canopy, and, in a tropical forest, emerging trees, each with a different set of functions.

The management of forests is referred to as forestry, i.e. the science and practice of managing, using, preserving, monitoring and creating forests, woodlands and associated resources for multiple uses. The overall goal of forest management is to create a sustainable or maintainable forest that continues to grow and produce its goods and benefits (FAO/FRA, 2000). This can be done in many different ways such as reforestation, even- and uneven-aged methods, controlled burns or selective and reduced-impact logging. Depending on the management objectives and methods, information about the following issues is required: forest health and diseases, tree species, canopy height, tree growth, timber stock, stand density (i.e. the number of trees per area), canopy cover (i.e. the area of ground under the tree canopy), forest gaps (i.e. abrupt vertical changes occurring between trees), tree crown, basal area (i.e. the section of land that is occupied by the cross-section of tree trunks), definition of boundaries and acreage, as well as biological inventory. For national and local forest inventories, additional parameters such as vertical canopy structure, wood quality, tree positions, and topographic and moisture conditions are required; these are traditionally collected by in-situ field measurements at regular intervals.

Sustainable forest management requires accurate spatial information at high temporal and spatial resolution (Imangholiloo et al., 2019). Although airborne remote sensing is widely used in forestry, one of the most critical barriers to its application is the lack of timely data collection over target areas (Tang und Shao, 2015) and the cost of data collection, especially for developing countries.

On the contrary, with a UAV platform the data acquisition over small to medium areas is quite flexible and, thus, the surveying can be repeated at shorter time intervals, which can be practical for monitoring under different phenological conditions or after a meteorological or anthropogenic event (Tang und Shao, 2015). Moreover, the UAV platform can be equipped with different sensors targeted to the forest parameters of interest.

UAVs have received increasing scientific attention in forestry in recent years, as summarised in a number of reviews (Assmann et al., 2019; Banu et al., 2016; Frey et al., 2018; Goodbody et al., 2017b; Hernandez-Santin et al., 2019; Liu et al., 2018; Pádua et al., 2017; Tang und Shao, 2015; Torresan et al., 2016; Zhang et al., 2016). The most commonly used terms in these articles are emphasized in Figure 4.4-1.

The construction of the platform, e.g. fixed-wing, copter or a combination of both, determines the ground coverage, the payload, the starting and landing capability, the flight time, stability, and quality (Adler et al., 2018). The operational altitude of a UAV in forestry usually varies from 50 to 300 m above ground (for small UAVs). The smallest UAVs can also fly close to the forest canopy (~20 m) and custom platforms can even fly under the forest canopy (Chisholm et al., 2013; Jiang et al., 2016; Krisanski et al., 2018a; Krisanski et al., 2018b; Tang und Shao, 2015). However, so far there has been very limited research on the potential of below-canopy UAVs for forest mapping.

Often, a fixed-wing UAV is a more suitable platform for covering large areas: up to 200 ha with one flight and up to 1000 ha in one working day in optimal weather and topographic conditions using multiple batteries on a lightweight fixed-wing UAV (Giannetti et al., 2018). Multicopters are also used, especially for multispectral and hyperspectral sensors. The types of sensors currently used in forestry are digital true colour (i.e. RGB) cameras, multispectral cameras and LiDAR (Light Detection and Ranging), followed by hyperspectral cameras, thermal detectors/cameras, and radar.

Figure 4.4-1: Sketch of the keywords used in the review papers on UAV in forestry (Jason Davies visualization).

Unless otherwise stated, all images were prepared by the author for this chapter.

4.4.1 UAV images for forestry

By applying Structure from Motion (SfM) photogrammetry and image matching algorithms, UAV image sequences are commonly processed into coherent data sets, providing structural and spectral information of the canopy surface in the form of reflectance orthomosaics, three-dimensional (3D) point clouds, and 2.5D digital surface models (DSMs). The latter is often used to determine the canopy height model (CHM) of forests by subtracting the height of the ground in the form of digital terrain models (DTMs).

There is a growing body of literature on the use of UAV image-based technologies for forest studies. In order to provide an overview of the trend of recent studies and applications using UAV images in forestry, we conducted a comprehensive literature search of scientific studies such as accessible journal papers, conference proceedings articles and official theses using Google Scholar and citation tracking. We searched the terms ‘UAV’, ‘unmanned aerial vehicle’, ‘UAS’, ‘unmanned aerial system’, and ‘drone’ in combination with the terms ‘forest’, ‘forestry’, ‘invasive species’, ‘forest fire’, ‘vegetation’, ‘canopy’ in the time period from 2012 to 2019 (Figure 4.4-2). The reviewed articles are categorized into thirteen UAV applications, including forest pre-/post-harvesting, biodiversity, fire, monitoring, health, above-ground biomass, structural parameters, and methods. Figure 4.4-2 compares the number of studies per application and shows the number of studies compared to the date of publication for each application. In this figure, we have only reported the keywords for each application. However, the full description of each classified application, as well as the list of references for each application, is reported as supplementary material. Please note that the statistics shown in the figure are calculated on the basis of the found articles. Studies are assigned to only one application; however, some applications overlap. For example, canopy height, which is a forest structure parameter, is also calculated in forest inventories, used for the detection and segmentation of individual trees and as a variable for above-ground biomass estimation.

Our literature search reflects the general trend of an increasing number of UAV studies (35 % were published in the past three years) and UAV applications in forestry (Figure 4.4-2). Moreover, the literature search pointed to recent innovative applications of UAVs in various fields of forestry, such as the estimation of phytovolume (i.e. the volume under the vegetal canopy), insecticide effects in forests (Leroy et al., 2019), light transmission and canopy shadow effects in river temperature models (Dugdale et al., 2019), seedling detection (Feduck et al., 2018; Imangholiloo et al., 2019), soil disturbance from forest machinery (Pierzchała et al., 2014), monitoring greenhouse gas emissions from forests (Mlambo et al., 2017), a census of an endangered plant species (Rominger und Meyer, 2019) and liana infestation (Waite et al., 2019).

More popular is the use of UAV imagery to support the estimation of forest structure and forest inventory parameters, forest fire management, and biodiversity characteristics such as canopy gaps and dead wood (Inoue et al., 2014) and species identification. Other tasks include monitoring of protected areas, harvesting activities, forest change and recovery, and the monitoring of forest health and pest infestation. Furthermore, a wide variety of studies have investigated the technical and methodological challenges of using UAVs in forestry (Figure 4.4-2, Methods) and the integration of UAV data with terrestrial images (Mikita et al., 2016), terrestrial LiDAR (Aicardi et al., 2017; Mtui, 2017), aerial LiDAR (Kotivuori et al., 2020) and satellite data (Abdollahnejad et al., 2018; Martin et al., 2018; Martínez-Sánchez et al., 2019; Navarro et al., 2019; Puliti et al., 2018; Rossi et al., 2018).

Figure 4.4-2: Applications of UAV imagery in the reviewed studies from 2012 to 2019.

Number of articles versus the application category (top) and number of articles versus publication date for each application (bottom). The reference list for each application is reported in the appendix.


Based on our literature research, forest structural parameters are the ones that have received the highest attention, followed by forest biomass, forest health and forest classification (Figure 4.4-2, the highlighted sectors). These applications are discussed in the following sub-chapters.

4.4.1.1 Forest structure parameters

Forest structure refers to the spatial (i.e. vertical and horizontal) arrangement of the components of a forest ecosystem and describes properties such as the distribution and abundance of vegetative elements (Lindenmayer et al., 2000). In detail, forest structural components include tree density, tree position, canopy cover and tree crown area, tree species composition, foliage distribution, canopy gaps, and light penetration and availability for the understory vegetation (Palace et al., 2016). In the field of forest inventory, forest parameters also include the presence of dead trees, basal area, diameter at breast height (DBH), trunk size distribution, and tree canopy height. The latter is often expressed through descriptive statistics such as the maximum height, mean height (i.e. the arithmetic mean of heights), standard deviation of the height values as well as the coefficient of variation of heights and height percentiles, often calculated between 10 % and 90 %. These structural components are important indicators for the investigation and modelling of forest dynamics, biological diversity and ecological processes (Awad, 2017; Lu et al., 2016; Molinier et al., 2016; Rahimizadeh et al., 2019) and their measurements are also used to estimate biomass and growing stock volume (Grznárová et al., 2019) and to derive disturbance mechanisms.

Technological advances in remote sensing related to forestry have contributed to the emergence of relatively new terms such as “Precision Forestry”, i.e. the provision of reliable, accurate and detailed information on the structural and ecological aspects of forests with high spatial and temporal resolution, even at the individual tree level (Holopainen et al., 2014).

Airborne laser scanning (ALS) data has been tested for efficacy in measuring forest structure properties through and under the top of the canopy (Sullivan et al., 2014). Due to the similarity between UAV photogrammetric and ALS point clouds, in the sense that both can represent vertical information in a dataset (White et al., 2013), UAV photogrammetric point clouds are increasingly used for calculating forest structure parameters (Balenović et al., 2017; Banu et al., 2016; Birdal et al., 2017; Bohlin et al., 2012; Dandois und Ellis, 2013; Goodbody et al., 2017a; Hird et al., 2017; Jayathunga et al., 2018a; Mohan et al., 2017; Ota et al., 2017; Torresan et al., 2016; Vastaranta et al., 2013; Zarco-Tejada et al., 2014). However, it is worth noting that compared to ALS, photogrammetric point clouds do not provide the same level of penetration into the canopy, and therefore cannot provide the same level of information on the vertical stratification of vegetation layers and the terrain (Torresan et al., 2016; White et al., 2013) (Figure 4.4-3).


Figure 4.4-3: Point clouds derived from UAV LiDAR (Riegl VUX1 – blue-green-yellow dots) and Sony Alpha imagery (red dots) over a forest scene. Profile width is 1 m.

The extraction of canopy height and vertical canopy profiles from UAV images often relies on algorithms developed for LiDAR data (Silva et al., 2015). Therefore, the potential of using spectral and textural information is not fully exploited yet. In addition, several studies have highlighted that UAV photogrammetry often requires an auxiliary very high-resolution DTM (González-Jaramillo et al., 2019; Messinger et al., 2016; Ota et al., 2015; Ullah et al., 2019), such as an ALS-derived DTM, to generate accurate information on canopy height and consequently stem volume and basal area. Only for forests with low stem density can 3D points from the terrain, and consequently a DTM, be derived from UAV imagery. For example, Lin et al. (2018) demonstrated the feasibility of deriving a photogrammetric DTM and tree height from oblique RGB photographs for a sparse subalpine coniferous forest.

Based on UAV image data, structural forest parameters are commonly extracted from the CHM rather than from the 3D point cloud. Among the forest parameters, the estimation of tree height and crown diameter from UAV imagery has received the highest attention, likely because individual tree characteristics (e.g. stem diameter and volume) can be estimated from these measurements.

To extract tree positions and heights from UAV CHMs or point clouds, local maxima algorithms are commonly used (Abdollahnejad et al., 2018; Guerra-Hernández et al., 2018; Mohan et al., 2017). Other studies derived the tree position based on a segmentation into tree crowns and the tree heights based on the highest DSM values within the tree crown segments (Ganz et al., 2019). However, local maxima approaches based on the CHM work efficiently for forests that have a well-defined apex, where tree tops are sufficiently separated from each other, where tree heights are uniform (e.g. conifers), and where there are no trees hidden under or between taller and larger ones, unlike in mixed and/or multi-layered forests (Balsi et al., 2018).
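
A minimal sketch of the local-maxima idea is given below: a cell is flagged as a treetop if it equals the maximum within a moving window and exceeds a minimum tree height. The window size and height threshold are illustrative parameters that must be tuned to the forest type and CHM resolution.

```python
# Minimal sketch of local-maxima treetop detection on a CHM raster
# (2D numpy array of canopy heights in metres).
import numpy as np
from scipy.ndimage import maximum_filter

def detect_treetops(chm, window=5, min_height=2.0):
    """Return (row, col) indices of CHM cells that are local maxima."""
    local_max = maximum_filter(chm, size=window) == chm
    treetops = local_max & (chm >= min_height)
    return np.argwhere(treetops)

# The CHM values at the returned positions give the detected tree heights:
#   for r, c in detect_treetops(chm): print(chm[r, c])
```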


Concerning tree height estimation, Alexander et al. (2018) developed an alternative approach for assessing the height of emergent trees in a tropical rainforest only from the DSM and the slope of the DSM, without the requirement for a terrain model.

Recent studies have shown very high precision (i.e. repeatability) of within-season tree height growth measurements of individual trees or forest stands using UAVs (Dempewolf et al., 2017; Guerra-Hernández et al., 2017; Krause et al., 2019; Mohan et al., 2017), although the topic merits further study. Multitemporal UAV surveys have also been successfully conducted for ecological monitoring (Zhang et al., 2016) and for quantifying the leaf phenology of individual trees (Park et al., 2019).

One aspect common to almost all studies is that ALS and field-based measurements are used for validation by analysing the root mean square error (RMSE) and the R2 for each parameter. Several studies comparing UAV and ALS canopy height report R2 values of 0.8 or higher (Dandois und Ellis, 2013; Jensen und Mathews, 2016; Lisein et al., 2013; Torresan et al., 2016; Zahawi et al., 2015; Zarco-Tejada et al., 2014). In comparison to indirect field-based measurements, which are however also potentially subject to error propagation (Larjavaara und Muller-Landau, 2013; Wang et al., 2019b), the R2 ranges between 0.63–0.84 (Fankhauser et al., 2018). High accuracy was also found in comparison to terrestrial LiDAR (Roşca et al., 2018).

The differences between structural metric estimates from ALS and UAV images are attributed to the limited ability of UAV images to penetrate the canopy layer, hence the overestimation of lower height percentiles and canopy density values. In addition, it is worth noting that differences in observed accuracy values are due to differences in data sources, in the variation of the surveyed forest types, flight configurations, image acquisition parameters, camera resolution, ground control points (GCP) and processing workflows used.

4.4.1.2 Forest mapping and classification and forest health

The characteristics of the sensor (i.e. spectral and spatial resolution) play an important role in the ability to monitor forest health, recognize plant diseases, map tree species and classify forest types and land cover from UAV image-based technologies. The following paragraphs summarise the main methodologies currently used for forest mapping and classification by means of UAV imaging technologies and provide an overview of the use of multispectral, hyperspectral and thermal technologies for forest health assessment. Information on state-of-the-art remote sensing of forest health and on the integration of spectral technology on UAV platforms can be found in the extensive reviews provided by Hall et al. (2016), Senf et al. (2017), Lausch et al. (2016; 2017), and Aasen et al. (2018). The detailed description of different types of UAV sensors and their calibration goes beyond the scope of this chapter.


Understanding the spatial distribution of individual trees, their species and size is important for biodiversity assessment, forest biomass prediction, ecosystem services and in general in the sustainable management of forest resources. For example, the relationship between DBH and biomass is species-specific, and therefore there is an increasing need to classify tree species with high accuracy.

Classification of tree species and land cover and the determination of vegetation species are often performed with semi-automatic approaches rather than only manual mapping. The classification method and the features used for classification play an important role in the accuracy of the classification. In classification processes, spectral, spatial and temporal features derived from UAV images are used independently or combined. The incorporation of temporal features (i.e. based on UAV time-series images) to help classify tree species is not fully exploited yet, while spectral and spatial characteristics are widely used for object-based analyses. For tree species classification, spatial-based features can be textural images or segmentations of crown size and shape, crown closure and stand density. For tree crown delineation and crown diameter extraction, the use of the watershed segmentation approach applied to the UAV DSM or CHM (Grznárová et al., 2019), in combination with manual single tree crown delineation on the orthomosaic (Iizuka et al., 2017), is very common. However, the visual quality of the photogrammetric CHM varies between stand species and forest density (Lisein et al., 2013).
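
The following is a minimal sketch of one common variant of this idea, marker-controlled watershed segmentation of crowns on a CHM: treetops detected as local maxima serve as markers and the watershed is run on the inverted CHM. It assumes scikit-image and SciPy are available, and all parameters are illustrative and need tuning for a given forest.

```python
# Minimal sketch of marker-controlled watershed crown segmentation on a CHM
# (2D numpy array of canopy heights in metres).
import numpy as np
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def segment_crowns(chm, min_distance=4, min_height=2.0):
    """Label individual tree crowns in a CHM; returns an integer label image."""
    canopy_mask = chm >= min_height
    tops = peak_local_max(chm, min_distance=min_distance,
                          labels=canopy_mask.astype(int))
    markers = np.zeros(chm.shape, dtype=int)
    markers[tuple(tops.T)] = np.arange(1, len(tops) + 1)
    # The watershed "floods" the inverted CHM outward from the treetop markers
    return watershed(-chm, markers, mask=canopy_mask)
```

Each labelled segment can then be summarized (crown area, crown diameter, maximum height) to derive per-tree parameters.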

Spectral features take advantage of different forest structures and chlorophyll content to differentiate forest types (e.g. broadleaf forest versus needle-leaved forest). Spectral information can be derived directly from the UAV orthophoto, but more often vegetation indices are calculated from the orthophoto according to the available spectral bands and the specific purpose. Multispectral sensors are more capable than visible-band cameras of detecting the spectral response from the forest canopy and other surfaces. Multispectral sensors on board UAVs usually operate in the visible, red-edge and near-infrared (NIR) spectral regions. Based on the NIR and red bands, the normalized difference vegetation index (NDVI) is commonly used to assess the greenness of trees, to detect dead trees and to delineate canopy gaps as well as to estimate biophysical parameters. In addition to NIR images, RGB images are also used to calculate various vegetation indices as a basis for forest mapping and classification (Zhang et al., 2019).

Multispectral UAV sensors have been used to monitor changes in land cover (Minařík und Langhammer, 2016), for tree species classification (Gini et al., 2014; Gini et al., 2018; Komárek et al., 2018), as well as for mapping forest health (Brovkina et al., 2018), insect damage (Lehmann et al., 2015) and disease outbreaks (Dash et al., 2017), and for estimating forest canopy fuels (Shin et al., 2018). A comprehensive literature review of the current state of UAVs for invasive alien plant research is provided by Dash et al. (2019).

As reported, many variables from spectral features, vegetation indices, texture, and structural information can be used for forest classification and mapping. However, to date, it is not yet clear how different features and data sources influence land cover or forest classification and which classification algorithm provides the best performance.

Pixel-based classification (PBC) was conducted on orthorectified RGB and multispectral UAV images to assess tree density, tree height and canopy cover (Durfee et al., 2019). However, PBC only works on the spectral features (i.e., reflectance values) of each pixel to assign class labels according to specified ranges (Fraser und Congalton, 2019). As a result, crown textures, gaps, and shadows reduce the accuracy of the classification. With the availability of very high spatial resolution images from UAV, object-based classification (OBC) is the predominant choice for reducing spectral variability within the classes. In fact, OBC works with groups of homogeneous and contiguous pixels, also known as segments, as basic elements to perform a classification (De Luca et al., 2019; Torres-Sánchez et al., 2015). Therefore, OBC also takes into account spatial characteristics to differentiate classes (Bothra et al., 2017).

Among the different classification algorithms found in the literature search, ranging from statistical algorithms (e.g., cluster analysis, k-nearest neighbour, maximum likelihood) to machine learning algorithms (e.g. support vector machines, decision trees and artificial neural networks), the random forest classification algorithm (Breiman, 2001) is the most used non-parametric learning algorithm. This algorithm has been successfully applied for tree species classification (Franklin und Ahmed, 2017), forest regeneration monitoring (Goodbody et al., 2017a), and for selecting the most important 3D metrics and spectral features for further inspections (Imangholiloo et al., 2019; Saarinen et al., 2017).
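
As a minimal illustration of such an object-based workflow, the sketch below trains a random forest on per-segment features (mean NDVI, mean CHM height, height standard deviation) against field-labelled classes and predicts new segments. All feature values, class names and the feature set itself are illustrative; real studies typically use many more spectral, textural and structural variables.

```python
# Minimal sketch of random forest classification of crown/land-cover segments
# from per-segment features. Values and class labels are illustrative.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Each row: [mean NDVI, mean CHM height (m), height std] for one segment
X_train = np.array([[0.82, 24.1, 3.2],
                    [0.78, 21.5, 2.9],
                    [0.55, 12.3, 4.8],
                    [0.51, 11.8, 5.1],
                    [0.20,  1.4, 0.6]])
y_train = ["spruce", "spruce", "birch", "birch", "gap"]

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

X_new = np.array([[0.80, 23.0, 3.0], [0.23, 1.1, 0.5]])
print(clf.predict(X_new))
print(dict(zip(["NDVI", "height", "height_std"], clf.feature_importances_)))
```

The feature importances reported by the model are often used, as noted above, to select the most relevant metrics for further inspection.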

Hyperspectral imagery from UAVs has recently been used for detecting damaged and dead trees (Näsi et al., 2015), identifying the different stages of bark beetle infestation (i.e., healthy, infested, and dead trees) (Näsi et al., 2018), mapping biodiversity indicators (Saarinen et al., 2018), identifying tree species (Cao et al., 2018; Li et al., 2019b; Nevalainen et al., 2017; Sandino et al., 2018) and for vegetation classification in general (Yan et al., 2019). These studies highlight the great value of hyperspectral data for vegetation classification and forest health management and its advantages over RGB imagery and multispectral data. According to the authors, tree species classification based on hyperspectral imagery can be further improved by adding structural information (i.e. a three-dimensional point cloud or surface model) of forest canopies. This information helps to reduce misclassification due to shade and varying illumination conditions and to discriminate species with similar spectral signatures but different structural characteristics (Cao et al., 2018; Sankey et al., 2017).

Thermal sensors that provide the temperature of the plant/forest canopy (Zarco-Tejada et al., 2012) are commonly used for stress detection (Junttila et al., 2016), due to the linear relationship between leaf or canopy temperature and transpiration (Maes & Steppe, 2012) (i.e. higher canopy temperatures likely indicate lower transpiration rates). Using airborne thermal imagery, Scherrer et al. (2011) assessed the drought sensitivity of deciduous forest tree species. Berni et al. (2009) were the first to test a helicopter-based UAV equipped with inexpensive thermal and narrow-band multispectral imaging sensors for estimating water stress and canopy temperature for vegetation monitoring. Recently, UAV thermal systems have been employed to monitor surface temperature dynamics on distinct land cover classes, e.g. disease-induced canopy temperature rise (Smigaj et al., 2015) and high stress levels in conifer forests (i.e. 1.5°C temperature difference) (Smigaj et al., 2017), and to quantify phenotypic traits of moderately stressed and non-stressed trees (Ludovisi et al., 2017). Other UAV thermal investigations have shown a correlation between canopy temperature depression and disease level (Smigaj et al., 2019). Specifically, Maes et al. (2018) demonstrated that infrared thermography based on UAVs provides a new method to study the plant-water relations of mistletoe and its host plants. With UAV thermal infrared images, Lapidot et al. (2019) showed that the estimated transpiration rate is close to values measured by direct gas exchange.

4.4.1.3 Forest biomass

Biomass refers to the amount of material accumulated by plants in a unit area (McKendry, 2002). Forest biomass is the main index to measure the carbon sequestration capacity of a forest (Réjou-Méchain et al., 2019), and consequently the carbon emission from deforestation and forest degradation. In this respect, there is an increasing need for consistent monitoring of forest biomass under the Reduction of Emissions from Deforestation and Forest Degradation (REDD+) program.

As the below-ground portion of forest biomass is difficult to obtain, the Aboveground Biomass (AGB) of the forest is usually estimated (Lin et al., 2018). The largest part of forest biomass consists of wood (70 % to 90 % of AGB), in which the dominant trees (~25–30 m) contain more than 75 % of the total carbon (Cuni Sanchez und Lindsell, 2016). Therefore, the most important predictors of the AGB of a tree are its trunk diameter, total height, wood specific gravity, and forest type (dry, moist, or wet) (Chave et al., 2005).
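
Allometric models combining these predictors commonly take the generic form AGB = a·(ρ·D²·H)^b, with D the trunk diameter, H the total height and ρ the wood specific gravity. The sketch below illustrates only this structure; the coefficients a and b are placeholders, not the published values of Chave et al. (2005), and in practice they would be taken from the allometric model appropriate for the forest type.

```python
# Minimal sketch of an allometric above-ground biomass estimate of the
# generic form AGB = a * (rho * D^2 * H)^b, where D is the trunk diameter (cm),
# H the tree height (m) and rho the wood specific gravity (g/cm^3).
# The coefficients a and b are placeholders, NOT published values.
def agb_kg(dbh_cm, height_m, wood_density, a=0.06, b=0.98):
    """Generic allometric AGB estimate (kg) for a single tree."""
    return a * (wood_density * dbh_cm**2 * height_m) ** b

# Illustrative tree: DBH 32 cm, height 24 m, wood density 0.6 g/cm^3
print(f"AGB ~ {agb_kg(32, 24, 0.6):.0f} kg")
```

UAV-derived tree heights and crown metrics enter such equations either directly or via models that first predict DBH from the remotely sensed variables.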

Forest AGB is rarely directly measured (Qureshi et al., 2012) and thus, indirect estimations are mainly achieved by biomass and forest yield models, carbon flux measurements, forest inventory-based approaches, and remote sensing methods (Qureshi et al., 2012). Remote sensing based methodologies either infer biomass through relationships between field-based estimations of biomass and the spectral signal, or through estimations of some other forest variables and the employment of allometric analysis (Lu et al., 2016). Within both approaches, actual forest cover, forest type and forest species mapping as well as tree height and DBH are very important sources of information (Galidaki et al., 2016). The latter two parameters are commonly measured in forest inventories to estimate AGB by applying allometric equations.

Several studies have shown that AGB and carbon stock can be estimated accurately from ALS data in various forest types (Asner et al., 2011; Gobakken et al., 2012; Hansen et al., 2015; Ioki et al., 2014; Lefsky et al., 2002; Montagnoli et al., 2015; Næsset et al., 2013; Næsset et al., 2004). In recent years, UAVs have been gradually utilized in AGB estimation of forests as an alternative to using ALS (Lin et al., 2018; Messinger et al., 2016). Although most of the literature in this respect is focused on UAV-based LiDAR data (Balsi et al., 2018; Brede et al., 2017), recent studies demonstrate that repeated UAV imagery can be used to estimate changes in AGB, for instance linked to selective logging in tropical forests (Ota et al., 2019). Similarly, Jayathunga et al. (2018b) demonstrated that digital photogrammetry of UAV imagery, when combined with a LiDAR DTM, can be used effectively for the estimation of plot-level stem volume and carbon stock of uneven-aged mixed conifer-broadleaf forest, with comparable accuracy to ALS data. Their reported RMSE of the UAV-estimated volume was comparable to other studies that used UAV photogrammetry with a LiDAR DTM (Puliti et al., 2015; Tuominen et al., 2015).

Since UAV-based biomass estimation relies on the availability of reliable DTMs, Kachamba et al. (2016) tested different ground filtering approaches to generate DTMs from UAV imagery in miombo woodlands. Except for the DTM based on the shuttle radar topography mission (SRTM), the differences between the tested DTMs were minor when compared in terms of the final biomass estimates. In a similar environment, Domingo et al. (2019) assessed the influence of image resolution, camera type and side overlap on the prediction accuracy of biomass models constructed from ground-based data and UAV data. The results showed that a reduction of side overlap from 80 to 70 %, while keeping a fixed forward overlap of 90 %, might be an option for reducing flight time and cost of acquisitions without decreasing the achieved accuracy. The analysis of the terrain slope effect on biomass predictions showed that the error increases with steeper slopes, especially on slopes greater than 35 %, but the effects were small in magnitude.

However, it is known from other ongoing studies that a reduction of the overlap below 85/85 % can lead to alignment errors, especially for leaf-off data sets. Therefore, it is strongly recommended to have a side and forward overlap > 85 %.

To estimate the AGB in a natural tropical mountain forest, González-Jaramillo et al. (2019) tested two methods: the first based on UAV RGB images, from which they derived tree height and DBH, and the second based on a multispectral camera used to calculate the NDVI. Their study found that the NDVI-based AGB estimates were less accurate due to the saturation effect in dense tropical forests, while the RGB photogrammetric approach provided reliable AGB (Mg/ha) estimates comparable to LiDAR surveys.


4.4.2 UAV LiDAR for forestry

LiDAR is widely used in forestry applications because of its ability to provide 3D information on canopy structure and terrain, even under dense canopy cover. The main product from LiDAR is a 3D point cloud, which is the basis for deriving high-resolution DTMs, DSMs, CHMs, as well as normalized point clouds and consequently a multitude of forest parameters. In forest areas with dense crown cover, the advantage of LiDAR over photogrammetry is more pronounced thanks to its ability to penetrate vegetation and retrieve terrain information. This merits the choice of the more expensive LiDAR scanner in applications where the vertical distribution of vegetation is of importance (Jensen et al., 2018), and/or where high-resolution DTMs from national LiDAR acquisitions are not available.

The first UAV-LiDAR system optimized for forestry applications is described in Wallace et al. (2012). The developed TerraLuma UAV-LiDAR system is a low-cost UAV system that combines GPS, IMU, LiDAR and High Definition (HD) camera data. The developed workflow for processing the data fuses observations from GPS, IMU and the HD video camera to determine the precise trajectory, which is a pre-requisite to achieve high geo-location accuracies of the derived 3D point cloud. Furthermore, Wallace et al. (2012) assessed the feasibility of UAV-based LiDAR for monitoring high-resolution changes within a Eucalyptus nitens plantation in Tasmania, Australia. They used their TerraLuma UAV-borne LiDAR system mounted on a multirotor UAV to acquire point clouds for extracting plot-level forest metrics. Within this study, they could confirm the repeatability of assessing these forest metrics with high accuracy.

Chisholm et al. (2013) used a LiDAR mounted on a UAV without any localization device for mapping a 20 x 20 m forest patch of roadside trees. They could detect trees greater than 20 cm DBH with an accuracy of 73 % within a 3 m flight path. Smaller and more distant trees could not be detected reliably. The DBH of the detected trees could be assessed with an absolute error of 18.1 %.

Due to the fast development in the UAV and LiDAR sensor domains, Amon et al. (2015) presented the survey-grade Riegl VUX-1 UAV scanner mounted on a RiCopter. Several research groups used this system for acquiring high-precision 3D points from forest areas. For example, Brede et al. (2017) presented the first UAV-based LiDAR data with accuracies comparable to TLS point clouds. Furthermore, Brede et al. (2017) reported that from these UAV-based LiDAR data DBH could be assessed with a correlation coefficient of 0.98 and an RMS of 4.24 cm compared to TLS-derived DBHs. For estimating DBH they applied Quantitative Structural Modelling (QSM) (Raumonen et al., 2013) combined with cylinder fitting. Also, Wieser et al. (2017) summarize that DBH > 20 cm can be reconstructed with a success rate of almost 100 %, with relative differences to the reference DBH of 9 % (DBH 20–30 cm) down to 1.8 % for DBH > 40 cm. They used a cylinder fitting approach for a dense LiDAR data set acquired over a complex alluvial forest scene in Austria.


In addition to the Riegl VUX scanner, Velodyne scanners were also used in different studies. For example, Liu et al. (2018) used UAV-LiDAR data to estimate forest structural attributes (i.e., DBH, Lorey's mean height, basal area, stem density, volume and AGB) for ginkgo plantation forests under different silvicultural treatments. They used a Velodyne Puck VLP-16 sensor mounted on a GV1300 multi-rotor UAV platform. The flight altitude was approx. 60 m above ground level, resulting in 160 pts m-2. Based on plot- and individual tree-level metrics derived from the LiDAR point clouds, different modelling approaches (i.e., PLS, k-NN and RF) were evaluated to derive the forest attributes. They found that models based on both plot-level and individual-tree-level metrics (CV-R2 = 0.66–0.97, rRMSE = 2.83–23.35 %) performed better than models based on the plot-level metrics only (CV-R2 = 0.62–0.97, rRMSE = 3.81–27.64 %). In the point cloud density sensitivity analysis, the canopy volume metrics showed a higher dependence on point cloud density than other metrics. Individual-tree results showed relatively high accuracies (F1-score > 74.9 %) when the point cloud density was > 16 pts m-2, whereas the correlations between AGB and the metrics of height percentiles, lower height level of canopy return densities and canopy cover remained stable across different point cloud densities (i.e. point cloud density reduced from 80 pts m-2 to 8 pts m-2).

Also, Guo et al. (2017) used a Velodyne Puck VLP-16 LiDAR scanner for acquiring 3D data for different forest types in China. They conclude that very high-resolution 3D terrain and canopy height models, canopy cover, LAI and AGB information can be derived from LiDAR data, which opens new possibilities to provide comprehensive 3D habitat information for biodiversity studies.

Furthermore, Yin & Wang (2019) used UAV-based LiDAR data, acquired with a Velodyne HDL32E LiDAR scanner mounted on an eight-rotor UAV platform, for extracting individual mangrove tree parameters (i.e. tree position, tree height, crown size) by applying a marker-controlled watershed segmentation algorithm to the CHM. The flying height was 40 m above ground, resulting in a mean point density of 91 pts m-2. They could delineate 46 % of the field-measured mangroves, which was promising considering the complexity of mangrove forests.

Wang et al. (2019a) used UAV-LiDAR as a sampling tool to combine field plots and Sentinel-2 imagery for mapping the height and AGB of the mangroves on Hainan Island in China. The UAV-LiDAR data was acquired with a Velodyne VLP-16 Puck sensor mounted on a DJI M600 UAV. The flight altitude was about 52 m above ground, resulting in a mean point density of all collected LiDAR data of 94 pts m-2. From the UAV-LiDAR data a DTM, a DSM, and height, density and canopy volume metrics were derived for grids with a cell size of 10 x 10 m, comparable with the Sentinel-2 pixel size. The UAV-LiDAR derived metrics served as input for a random forest-based approach for mapping the AGB and height of the mangroves. The results show that the UAV-LiDAR based estimation models for AGB and canopy height performed better than the traditional remote sensing method that directly relates ground plots and Sentinel-2 data. Furthermore, the results show that the UAV-LiDAR metrics describing the canopy thickness are the most important variables for mangrove AGB estimation.


Finally, an extensive summary of the potential of ultra-high-density drone LiDAR data for forestry applications is given by Kellner et al. (2019). They conclude that the derived 3D model can clearly resolve branch and stem structure, which is comparable to results derived from terrestrial laser scans.

4.4.3 Synthetic Aperture Radar

The use of Synthetic Aperture Radar (SAR) mounted on UAVs has only been evaluated in a few academic studies. Indeed, there are many acquisition challenges, especially for estimating forest parameters, that still need to be overcome, such as the UAV flight planning according to the beam width (i.e. the area covered) and the radar measurement geometry (incidence angle and spatial resolution), as well as the impact of forest structure and site characteristics (slope, aspect, soil moisture content) (Robinson et al., 2013). Furthermore, SAR data are technically challenging to process. In this context, in studies of UAV SAR, researchers have focused mainly on system design, signal processing, and data acquisition (Aguasca et al., 2013; Dewantari et al., 2018; Ding et al., 2019; Edwards et al., 2008; Essen et al., 2012; Li et al., 2018; Lort et al., 2018). For instance, the Finnish Geospatial Research Institute developed a Ku-band UAV-borne profiling radar (i.e., waveform) to better understand the backscatter radar response for forest mapping and inventories. The application over boreal forests showed that the profiling radar successfully detected the top of the forest canopy and the ground surface with accuracy comparable to simultaneous Lidar measurements (Piermattei et al., 2017) (Figure 4.4-4).

Figure 4.4-4: (a) Ku-band vertical profile and Lidar points within one footprint cone and (b) the comparison between one profiling radar waveform and the corresponding Lidar points for a high tree (Piermattei et al., 2017. Originally published under a Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/)).


4.4.4 Strength, limitations and future directions of operational applications

The UAV technology in forestry undoubtedly offers many advantages, such as frequent data collection and low operating cost, and thus it has great potential to become the new operational standard for small forest properties. In fact, aerial photogrammetry based on UAVs allows any user, such as private forest owners, to purchase a relatively cheap platform, acquire high-resolution 3D data and even process it on a desktop computer.

In recent years, the UAV market has made huge strides in supporting novice UAV users by continuing to invest in fully autonomous UAV solutions, with continued improvements in software and sensors, providing training and even data processing for those without computer experience (Paneque-Gálvez et al., 2014). In this direction, several companies are growing, such as OpenForests, Delair, DroneDeploy, TimberDrone and Mosaicmill, to cite just a few, that support forest managers and organizations in acquiring UAV data (camera and LiDAR) over small and medium-size forest properties as well as assist them with data processing and analysis. This type of service contributes to a significant advance in the operational use of UAVs in forestry and, in general, in the interest in UAV forest surveys.

Despite these advances, the operational use of UAVs in forestry is still at an early stage. In fact, there remain a number of inefficiencies and limitations in the use of UAV data and their collection and processing, which are addressed below.

At present, UAVs equipped with LiDAR sensors are considerably more expensive than UAVs equipped with digital consumer cameras (Roşca et al., 2018). Unlike ALS, photogrammetric UAV point clouds are generated through image matching only on surfaces captured by the camera, which results in an absence of points below dense forest canopy. This makes it very difficult to generate a reliable terrain model in dense forests from UAV photogrammetry data alone (Torresan et al., 2016), which is essential for deriving canopy height (Dandois and Ellis, 2013; Lisein et al., 2013; Puliti et al., 2015; Tuominen et al., 2015). Moreover, the placing of ground control points can be challenging, as points under the canopy are not well visible on imagery, which hinders accurate georeferencing. Another limiting factor is wind during image acquisition, which leads to errors in the image matching process.

The option of using freely available DTM data, e.g. the SRTM DTM, has failed to produce accurate canopy height estimations because of its low resolution (e.g. 30 m) and the errors associated with it (Jayathunga et al., 2018a; Su and Guo, 2014). Therefore, an accurate high-resolution DTM, such as a LiDAR DTM, is recommended to achieve accurate normalization of photogrammetric point clouds from UAVs in the case of dense canopy (Jayathunga et al., 2018b). Furthermore, it is worth noting that accurate estimation of canopy structure and canopy height requires a high-quality co-registration of all involved data sources, which often represents a significant challenge.

As observed in many comparisons of photogrammetric CHMs with LiDAR CHMs, photogrammetric CHMs tend to overestimate canopy heights as a result of occlusions (i.e. point clouds that do not penetrate to ground level to define crown boundaries), shadows and smoothing (Saarinen et al., 2017). In particular, coniferous stands with numerous abrupt fine-scale peaks and gaps in the outer canopy seem to suffer more from the smoothing effect induced by dense matching (Lisein et al., 2013).

The major limitation of UAV applications, independent of the sensor carried, is the limited flight endurance (Torresan et al., 2016) and thus the size of the mapped areas. A trade-off between areal coverage, which is primarily a function of flying height and sensor viewshed, and point-cloud density or resolution is always necessary. Therefore, further work is required to fully explore the potential of areal upscaling from the scale of individual trees and small forest stands to the geometric characterization of entire forests and plantations. The use of larger aircraft powered by gasoline engines will allow data collection over much larger areas with more advanced imaging sensors. For example, a UAV under development at the Wake Forest UAV Lab will provide the capability to carry 5 kg of sensors or other equipment for over four hours, allowing coverage of approximately 13 000 ha per flight at a GSD of 7.7 cm (Messinger et al., 2016).

Another aspect to consider is that, for the possible combinations of sensors and flight configurations, it is still unclear what the optimal methods might be for accurately measuring and mapping forest parameters using these techniques (Dandois et al., 2015).

Within the planning phase of image acquisition, various decisions must be made that influence the results significantly. In fact, the quality of the UAV photogrammetric point cloud, such as geometric positioning accuracy, point cloud density, canopy penetration, estimates of canopy structure, and point cloud radiometric quality, varies as a function of the observation conditions and acquisition strategy, e.g., the image overlap or the flight altitude (Dandois and Ellis, 2013), and the choice of sensor for a given application.
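
To make the coupling between flight altitude, overlap and coverage more tangible, the following sketch computes the ground sampling distance and the photo and flight-line spacing for a nadir survey from simple pinhole-camera geometry; all camera parameters and numeric values are illustrative assumptions, not a recommendation for any particular system.

```python
def flight_plan(altitude_m, focal_length_mm, pixel_size_um, img_width_px, img_height_px,
                forward_overlap=0.8, side_overlap=0.8):
    """Ground sampling distance and photo/flight-line spacing for a nadir survey.

    Simple pinhole-camera geometry; every value passed in below is illustrative.
    """
    gsd = altitude_m * (pixel_size_um * 1e-3) / focal_length_mm   # metres per pixel
    footprint_along = gsd * img_height_px                         # image footprint along track [m]
    footprint_across = gsd * img_width_px                         # image footprint across track [m]
    photo_spacing = footprint_along * (1.0 - forward_overlap)     # distance between exposures [m]
    line_spacing = footprint_across * (1.0 - side_overlap)        # distance between flight lines [m]
    return gsd, photo_spacing, line_spacing

# Example: a 20 MP camera (5472 x 3648 px, 2.4 um pixels, 8.8 mm lens) flown 80 m above the canopy.
gsd, dx, dy = flight_plan(80.0, 8.8, 2.4, 5472, 3648)
print(f"GSD: {gsd * 100:.1f} cm/px, photo spacing: {dx:.1f} m, line spacing: {dy:.1f} m")
```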

Some research was undertaken to define an appropriate template for UAV acquisition in forested areas, although the efficiency of the method used depends on the complexity of the forest stand structure (Brieger et al., 2019). Dandois et al. (2015) evaluated the effect of flight height and image overlap on the reconstruction of tree heights, canopy penetration, and point cloud density. The 3D point cloud density and the canopy penetration are strongly related to forward photographic overlap. They showed that 80 % photographic side-overlap and 80 m altitude above the canopy under optimal conditions of clear skies resulted in estimates of canopy height that were highly correlated with both field and LiDAR estimates of canopy height (R² = 0.86 and 0.99, respectively). Similarly, Balenović et al. (2019) found that high image overlaps contributed considerably to the accuracy of image orientation. Additional slight improvements were achieved by replacing single-frequency GNSS measurements with dual-frequency GNSS measurements. However, the application of this methodology to more geometrically complex multilayered forest environments remains a question for further research.

Choosing the sensor should always be a trade-off between the potential gain in accuracy and spectral information and the cost associated with more sophisticated sensors (Barbedo, 2019). In forestry, UAV-based photogrammetry systems mainly carry RGB and multispectral cameras, since the data acquisition and processing of hyperspectral and thermal sensors on board UAVs are more complex and limited to good weather conditions. Their operational limitations concern the necessary pre-flight operations (e.g. spectral calibration (Lucieer et al., 2014)) and post-flight pre-processing (e.g. radiometric and geometric corrections (Hruska et al., 2012)) required to ensure the usefulness of hyperspectral and thermal information (Aasen et al., 2018; Adão et al., 2017; Proctor and He, 2015). Furthermore, to date, there are only a few single-camera systems that allow collecting hyperspectral and structural information from the same sensor (Honkavaara et al., 2012). A similar issue exists for thermal images. Webster et al. (2018) were the first to combine thermal and RGB images simultaneously acquired from a UAV platform to generate separate thermal and RGB point clouds of 3D structures. The processing of 2D thermal imagery to produce fully 3D models containing thermal information remains to be fully explored in the context of forest canopy structure. For the derivation of accurate plant temperature measurements from UAV thermal imagery, users need to be aware of the impact of environmental factors such as air temperature, humidity, radiation, wind speed (Leinonen et al., 2006), and the amount of shaded leaves at the canopy level (Gonzalez-Dugo et al., 2013). Some experiments have shown that the most favourable time of day to acquire thermal images is around midday (Berni et al., 2009) and that measurements should always be taken at the same time of day. Other aspects requiring further examination include the impacts of target emissivity and sensor calibration, error characterization and spatiotemporal non-uniformity corrections, and the identification of in-flight effects (i.e., wind speed, directional viewing effects, and ambient temperature) on sensor stability and temperature estimation (Kelly et al., 2019; Malbeteau et al., 2018).

Improvements in sensor resolution are needed for identifying individual diseased trees. In the case of hyperspectral images, the greater the number of bands, the lower the resolution of each spectral band (Iseli and Lucieer, 2019), with effects on the level of spatial detail and, therefore, on the range of flight heights and the size of the regions that can be captured (Guijun et al., 2017). Li et al. (2019a) integrated a low-cost multi-sensor UAV system composed of a GNSS receiver, an IMU, a global shutter camera, a multispectral camera, and a laser scanner. Such a multi-sensor system enables the fusion of imagery and laser scanning data for reliable forest inventory applications. However, further research is needed before such multi-sensor systems can be used for operational forest applications.

4.4.5 Conclusions

The use of UAVs for the monitoring and protection of forests and other natural resources is currently in an expansion phase, encouraged by the constant development of new UAV platforms, sensors and software solutions. From UAV LiDAR data, high-precision topographic models and forest structure parameters such as tree position, tree height, crown shape and size, crown coverage, vertical structure distribution and LAI can be derived. To date, only survey-grade UAV-LiDAR systems are capable of acquiring data with the accuracy required for extracting 3D models of stems and branches. Based on such high-precision 3D LiDAR data, several forest parameters can be derived in a highly automated way and integrated into operational forestry applications. The increasing accessibility of LiDAR sensors in terms of cost and size, along with data combination methodologies, will greatly improve the utilization of UAVs in forestry. However, UAV image-based technologies (e.g., RGB, multispectral, or hyperspectral) currently provide an alternative, cost-effective data source to UAV LiDAR and conventional remote sensing data, particularly where field data collection is costly, field locations are hardly accessible, or the use of remote sensing to complement field sampling is advisable. In addition, UAV image-based technologies have consistently proven useful for forest mapping and the classification of forests and tree species in a wide variety of forest types. Future generations of UAVs will continue to evolve and offer increased flight time and improved sensors. Therefore, future applications will span a large range of forestry fields, covering the large variety of situations that occur in the operational management of forests.

References for further reading


4.5 UAVs in cryosphere research

Mark W. Smith, J. Chambers and Jonathan L. Carrivick

4.5.1 Mapping surface features
4.5.2 Topographic data of glaciated terrain
4.5.3 Detecting bed topography and layering within snow and ice
4.5.4 Quantifying snow depth
4.5.5 Quantifying glacier melt rates and retreat
4.5.6 Ice velocity measurement
4.5.7 Estimating aerodynamic roughness
4.5.8 Albedo measurement
4.5.9 Surface temperature measurement
4.5.10 Challenges and future opportunities

Earth observation methods have long been used to supplement cryosphere field campaigns as part of detailed and systematic survey and monitoring programmes. This has in part been due to the large scale and inaccessibility of the cryosphere. However, the challenging weather, low air pressure, poor reception of GPS signals and hazardous ice- and snow-covered terrain have meant that the adoption of UAVs by cryospheric scientists has lagged behind that within some other environmental sciences. Indeed, a previous systematic review by Bhardwaj et al. (2016) identified just 20 studies using UAVs in glaciological research.


Figure 4.5-1: Cartoon showing the main applications of UAVs in the cryosphere. (A) mapping surface features (chapter 4.5.1). (B) collecting topographic data from glaciated terrain, e.g. terminal moraines and lateral moraines (chapter 4.5.2). (C) using ice penetrating radar to observe internal structure and bed topography (chapter 4.5.3). (D) measuring snow depth (subtracting DEMs with snow from snow-free DEMs) (chapter 4.5.4). (E) calculating melt and retreat rates using DEMs of difference (chapter 4.5.5). (F) obtaining ice velocity from feature tracking across multi-temporal imagery (chapter 4.5.6). (G) estimating aerodynamic roughness from microtopographic data (chapter 4.5.7). (H) observing changes in albedo caused by differences in reflectance, e.g. by cryoconite (chapter 4.5.8). (I) detecting variation in surface temperature caused by debris cover such as medial moraines (chapter 4.5.9). All figures were prepared by the authors for this chapter.


Despite these challenges, there has been a proliferation of UAV use in cryospheric research over the last four years, with an order-of-magnitude increase in published research utilising the technology. The primary motivation for this increased popularity is the spatial resolution of UAV imagery; centimetric ground sampling distances outperform very-high-resolution satellite imagery and, for the first time, enable detailed glacier-scale observation of a variety of surface features. As such, this chapter aims to review the scope of glaciological applications of UAVs (Figure 4.5-1) and provide examples of the progress and challenges of each.

From the first applications of UAVs, the resulting aerial imagery has been used primarily as a mapping tool (4.5.1). In particular, topographic surveys of glacial areas (4.5.2) are able to cover larger areas than ground-based surveys. Radar instruments mounted on UAVs provide further information on subsurface ice and snow structures (4.5.3). Through repeat aerial survey, topographic changes can be quantified, typically to determine snow depths (4.5.4) or melt rates (4.5.5). Similar comparisons of aerial imagery provide glacier surface velocity estimates (4.5.6). Interrogation of UAV-derived imagery or topographic data permits calculation of specific properties of ice and snow, such as aerodynamic roughness (4.5.7), albedo (4.5.8) or temperature (4.5.9).

4.5.1 Mapping surface features

Aerial imagery from UAVs provides an opportunity for centimetric-resolution mapping of surface features. Within the cryospheric sciences, an early application of UAV imagery for mapping was that of Hodson et al. (2007), which identified the spatial concentrations of cryoconite across the snow-free surface of the Midtre Lovénbreen glacier in Svalbard. Using a supervised classification of UAV images, cryoconite could be mapped accurately; however, small, dispersed granules of cryoconite (< 0.25 cm²) could not be resolved. On the same glacier, Rippin et al. (2015) used a UAV to map the supra-glacial drainage network and used their UAV imagery to identify a relationship between channel density, surface roughness and surface reflectance, which has important implications for the glacier surface energy balance.

The mapping of larger-scale features is typically undertaken using more readily available satellite imagery; however, UAV data are still crucial to identify biases in satellite-based methods and to aid the interpretation of the satellite imagery. In this way, Inoue et al. (2008) obtained UAV imagery of melt ponds on sea ice in the Beaufort Sea. By thresholding RGB colour-distribution histograms in the UAV imagery, pond concentration was mapped and used to identify a negative bias in estimates derived from satellite passive microwave-based observations. At the same site, Tschundi et al. (2008) also used UAV imagery to validate melt pond concentration estimates from the daily MODIS surface reflectance product and observed that reflectance-based estimates performed well. More recently, Wang et al. (2018c) used a similar method to map melt pond fraction over Arctic sea ice.
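
A threshold-based classification of this kind reduces to a few per-pixel rules. The sketch below illustrates the general idea with assumed, illustrative thresholds and file names; it is not the specific rule set of Inoue et al. (2008) or Wang et al. (2018c).

```python
import numpy as np
import imageio.v3 as iio

# Read a UAV orthoimage of sea ice (path and threshold values are illustrative only).
img = iio.imread("sea_ice_ortho.png").astype(float) / 255.0
r, g, b = img[..., 0], img[..., 1], img[..., 2]

# Melt ponds appear blue-grey and darker than the surrounding white ice, so a simple
# per-pixel rule on overall brightness and relative blueness separates the two classes.
brightness = (r + g + b) / 3.0
pond = (brightness < 0.75) & (b > r)        # assumed thresholds, tuned per survey

pond_fraction = pond.mean()
print(f"Melt pond fraction: {pond_fraction:.2%}")
```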

Snow extent has also been mapped automatically by classifying UAV-based orthophotographs (Niedzielski et al., 2018) (see also chapter 6.5.5). Similarly, at the Forni glacier in the Italian Alps, Fugazza et al. (2015) developed a semi-automatic approach to mapping surface features from UAV imagery that was shown to outperform satellite-based approaches and to identify much smaller features, including individual crevasses.

4.5.2 Topographic data of glaciated terrain

Mapping of surface features from UAV imagery is often combined with the acquisition of topographic datasets. While UAV-based LiDAR systems have been applied to map ice topography (e.g. Crocker et al., 2012), topographic datasets are more commonly obtained from UAVs via Structure-from-Motion (SfM) photogrammetry (chapter 2.2), which has been shown to perform well even on relatively featureless ice surfaces in validation tests against laser altimeter data (Solbø & Storvold, 2013). The aerial images and topographic derivative datasets act in combination as an effective geomorphological mapping tool and permit identification and quantitative analysis of a variety of supraglacial and especially proglacial features (Figure 4.5-2).

The requirement for a well-distributed network of accurately surveyed Ground Control Points (GCPs) can limit the applicability of SfM photogrammetry over large scales or in areas of inaccessible or hazardous terrain, as typically encountered in cryospheric research (Carrivick et al., 2016). To circumvent this requirement, Chudley et al. (2019) demonstrate an alternative 'direct georeferencing' approach (chapter 2.1) in which the location of the imagery is recorded to high accuracy. This enabled the production of decimetre-scale accuracy topographic models over the calving front of Store Glacier in western Greenland and has since been adopted by other researchers working in such inaccessible environments (Jouvet et al., 2019a). Certainly, the increased accuracy of directly georeferenced SfM topographic models represents an exciting development that will further facilitate UAV-based research in the cryosphere.

Proglacial applications of UAV-based topographic data include mapping and dimension analyses of drumlins (Clayton, 2012) and flutes (Clayton, 2017). Early tests against total station data note that UAV-based topographic mapping of formerly glaciated areas can be effective in areas without dense vegetation cover (Tonkin et al., 2014). At Isfallsglaciären in arctic Sweden, Ely et al. (2017) derived a geomorphological map from a 2 cm horizontal resolution orthophoto and Digital Elevation Model (DEM) obtained via UAV. The dataset compared favourably to a larger valley-scale DEM obtained via Terrestrial Laser Scanning (Carrivick et al., 2015) and was used to clearly identify moraines, fans, channels and flutes, alongside the association of the latter with the presence of boulders. Ewertowski et al. (2019) undertook a similar study at the foreland of Hørbyebreen, Svalbard, to identify flutes, ridges and crevasse traces. They even suggest that the high-resolution imagery can be used to provide an impression of clast shape and aid geomorphological interpretations. By upscaling patch-scale relationships identified between grain size and surface roughness to UAV-based topographic data, Westoby et al. (2015) present a distributed grain-size map of Antarctic moraines to inform sedimentological characterization. Dąbski et al. (2017) undertook such geomorphological interpretations of periglacial landforms on King George Island from fixed-wing UAV-based images and derived topography. The resulting dataset is sufficiently detailed for polygons of classified landforms to be established (e.g. solifluction landforms, scarps, taluses, patterned ground) and their relative surface cover quantified.

Figure 4.5-2: Mavic Pro 2 UAV (left) used as part of a glacier-scale SfM topographic survey at Quelccaya ice cap, Peru (example UAV image, right).

Furthermore, UAV-derived topographic data have been used as input for hydraulic modelling of glacial outburst floods and as part of hazard assessment in general, as demonstrated by Watson et al. (2019) in the Himalayas.

4.5.3 Detecting bed topography and layering within snow and ice

More experimentally, UAV-mounted sensors have been used to map features beneath the surface. Leuschen et al. (2014) used a dual-frequency UAV-mounted radar in Antarctica to obtain the first successful glacier bed topographic data from a UAV. Keshmiri et al. (2017) later deployed the same UAV-based system to obtain a radar echo sounding of Russell glacier in Greenland (also reported in Rodriguez-Morales et al. (2017) and Arnold et al. (2018)), where deployment of ground-based radar is problematic because insufficient snowfall to infill crevasses renders glacier travel by snowmobile and sledge unsafe even in winter.

Similarly, Jenssen et al. (2016) presented an Ultra Wide Band (UWB) radar used to measure snow layering in avalanche starting zones and identify potential failure planes on which slab avalanches may form. Preliminary results compared well with density observations and, while clearly at an early experimental stage, provide a promising alternative for upscaling from often high-risk point-based snow pit observations. The UWB radar was mounted on a UAV flown just 1 m above the surface (Jenssen et al., 2018) and could identify detailed snow stratigraphy.

4.5.4 Quantifying snow depth

One of the more common applications of UAVs in cryospheric research is the production of detailed distributed maps of snow depth (Sturm, 2015). Such maps are very important for water resource management in alpine areas; yet, prior to the availability of UAVs, snow depth was notoriously difficult to measure or even estimate over mountainous terrain. UAV-derived snow depth maps have been produced over a range of spatial scales and terrain types and using a number of different depth calculation methodologies. While Hawley & Millstein (2019) used assumptions on the structure of the underlying topography to quantify snow drifting around structures at Summit Station, Greenland, with a UAV, snow depth is more commonly obtained by acquiring topographic data via UAV-based SfM photogrammetry and subtracting a summer reference topographic model.

In an early application mapping the extent of avalanche debris, Eckerstorfer et al. (2015) used a 10 m topographic basemap in the absence of a summer reference. Across multiple studies, validation of UAV-based snow depth estimates indicates that sub-decimetre accuracy can be obtained where ground control is available under favourable conditions (Vander Jagt et al., 2015; De Michele et al., 2016; Bühler et al., 2016; Harder et al., 2016; Cimoli et al., 2017), with direct georeferencing approaches exhibiting twice that error (Vander Jagt et al., 2015). The approach improves on the previous use of interpolated point measurements of snow depth (Bühler et al., 2016) while allowing the spatial structure and auto-correlation of snow depth to be investigated (e.g. Redpath et al., 2018). UAV-based snow depth estimates yield similar errors to those observed from piloted aircraft (Nolan et al., 2015) and slightly lower errors than estimates from very high resolution (Pléiades) satellite stereo-imagery (Marti et al., 2016). While Nolan et al. (2015) point out that piloted aircraft are more useful in remote areas, by removing the need for expeditions and owing to the larger survey areas, fixed-wing UAVs have been used to cover areas of ~1 km² (Harder et al., 2016) and offer the potential for more cost-effective and regular on-demand operational measurements for water storage and avalanche prediction applications (Bühler et al., 2016).
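
In its simplest form, the snow-depth workflow is a per-cell subtraction of a snow-free reference DEM from a snow-on DEM. The sketch below assumes two already co-registered rasters on an identical grid; the file names and the single, uniform level of detection are illustrative stand-ins for the spatially variable error models discussed in the following paragraph.

```python
import numpy as np
import rasterio

# Two co-registered DEMs on an identical grid (file names are illustrative).
with rasterio.open("dem_snow_on.tif") as src_on, rasterio.open("dem_snow_free.tif") as src_off:
    snow_on = src_on.read(1, masked=True)
    snow_free = src_off.read(1, masked=True)
    profile = src_on.profile

# Snow depth is the elevation difference; small negative values are treated as noise.
snow_depth = snow_on - snow_free
snow_depth = np.ma.masked_less(snow_depth, 0.0)

# A single level of detection (0.10 m, roughly the sub-decimetre errors reported above)
# is a simple stand-in for spatially variable levels of detection.
lod = 0.10
significant = np.ma.masked_less(snow_depth, lod)
print("Mean snow depth above LoD [m]:", float(significant.mean()))

# Write the snow-depth raster for further analysis or mapping.
profile.update(dtype="float32", nodata=-9999.0)
with rasterio.open("snow_depth.tif", "w", **profile) as dst:
    dst.write(snow_depth.filled(-9999.0).astype("float32"), 1)
```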

Despite the popularity of the approach, substantial accuracy challenges remain for UAV-based snow depth estimates. The study of Cimoli et al. (2017) evaluated the method at six locations in Svalbard and West Greenland and observed that variable snow surface patterns, lighting conditions, vegetation and topography influence the achievable accuracy. Fernandes et al. (2018) note that the use of a snow-free topographic model is perhaps the limiting factor in snow depth estimates, as errors were similar in magnitude to vegetation heights. In an extensive evaluation in the Canadian Rocky Mountains, Harder et al. (2016) suggest that meaningful snow depth estimates can only be obtained where the depth is > 30 cm. Bernard et al. (2017b) note that the use of a summer reference as part of the DEM of Difference (DoD) methodology assumes no underlying changes in topography during the survey interval, although in highly active mountainous and arctic environments both geomorphological activity and moraine dynamics are observed to be sizeable (Bernard et al., 2017a). Typically, these studies undertake a straightforward DEM differencing approach; while this may be appropriate given the magnitude of topographic changes involved, progress in using spatially variable levels of detection and precision maps in geomorphological research (Wheaton et al., 2010; James et al., 2017a) could reduce the magnitude of the errors observed and improve the overall reliability of the technique.

The application of SfM photogrammetry as part of the workflow proves challenging on often rather featureless snow surfaces, where image matching algorithms fail to detect sufficient keypoint correspondences for accurate surface reconstruction (Smith et al., 2015). To overcome this considerable challenge, near-infrared imagery has been used owing to its higher contrast and lower reflection over snow-covered areas (e.g. Bühler et al., 2016; Miziński & Niedzielski, 2017; Bühler et al., 2017) and has been shown by Adams et al. (2018) to offer improvements over images in the visible spectrum.

4.5.5 Quantifying glacier melt rates and retreat

The same DEM differencing approach applied to glacier ice can be used to measure glacier elevation changes and hence melt rates. Rates of glacier terminus retreat can also be quantified. With sub-decimetre errors, the use of UAVs offers improvements in precision over satellite-based glacier monitoring systems, but also provides spatially distributed estimates rather than the at-a-point measurements that result from field monitoring using ablation stakes. UAVs also allow regular, low-cost, on-demand multi-temporal surveys to better identify spatial patterns and controls on glacier melt rates, and they can supplement or continue longer time series of aerial surveys (e.g. Mölg et al., 2019).


Whitehead et al. (2013, 2014) present an early example of UAV-based glacier monitoring of Fountain Glacier on Bylot Island in the Canadian Arctic. Immerzeel et al. (2014) applied a UAV-based glacier monitoring system to a debris-covered Himalayan glacier. While the observed glacier mass loss was limited, the high-resolution, detailed distributed map of topographic change achievable with a UAV revealed very high spatial variability of melt rates. The ability to couple DoDs with orthophotos further enabled them to observe that areas around ice cliffs and supra-glacial ponds were often associated with mass losses an order of magnitude higher than average. Wigmore and Mark (2017) observed similar variability and association with ice cliffs in the Cordillera Blanca in Peru, while Seier et al. (2017) detected ice collapses at a lateral crevasse field in addition to mean glacier surface lowering at Pasterze Glacier, Austria. As with snow depth monitoring, sub-decimetre errors are reported in these studies.

The DEM differencing approach has also been used to investigate moraine dynamics by Bernard et al. (2017a) and to capture and quantify a high-magnitude catastrophic subsidence event on Dålk Glacier, East Antarctica, by Florinsky and Bliakharskii (2019). The collapse of an englacial cavern led to the formation of an ice depression up to 43 m deep over an area of ~40,000 m². With UAV surveys captured ten days before, one hour after and ten days after the event, the development of the subsidence, triggered by supraglacial water accumulation over a thin cavern roof, could be observed.

At a larger scale, Ryan et al. (2015) used repeat SfM surveys from a fixed-wing UAV to obtain mass loss estimates from a 5.3 km wide calving ice front on Store Glacier, Greenland. To achieve the larger spatial coverage, a higher flight altitude was required (~500 m vs ~100 m in previous mountain glacier examples), which resulted in DEM errors of around 2 m. However, this is more than adequate given the scale of the glaciological application. More recently, Jouvet et al. (2019a) scaled this approach up even further by quantifying volumetric changes over six calving glaciers in Inglefield Bredning, northwest Greenland.

4.5.6 Ice velocity measurement

Imagery obtained from UAVs is frequently used for feature tracking on glacier surfaces to determine glacier flow rates (chapter 3.3). While this can be undertaken by manually identifying objects in multiple co-registered images to obtain movement vectors (e.g. Immerzeel et al., 2014; Dall'Asta et al., 2015; Wigmore and Mark, 2017; Rossini et al., 2018), image correlation tools (e.g. normalized cross-correlation, COSI-Corr) can be used to obtain distributed image-to-image displacements between co-registered orthomosaics. Kraaijenbrink et al. (2016) used this technique to identify considerable spatial and seasonal differences in surface velocity for the debris-covered Lirung glacier in the Himalayas. Jouvet et al. (2019a) also successfully used this method to obtain ice surface displacement fields for six calving tidewater glaciers in Greenland. However, Whitehead et al. (2013) reported challenges in obtaining sufficient image correlation on relatively featureless ice-covered surfaces. Alternatively, Chudley et al. (2019) used particle image velocimetry software to generate velocity fields across the calving front of Store Glacier, though manual filtering of erroneous values was required. Nevertheless, while there is a more established history of applying feature-tracking techniques to satellite imagery (e.g. Quincey et al., 2009), it is the superior spatial resolution of UAV imagery that permits lower flow velocities to be detected or shorter survey intervals to be interrogated. For example, local variations in the direction of the velocity field, as detected around a lateral crevasse field by Seier et al. (2017), can be observed clearly from UAV data.
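
The displacement-tracking step can be illustrated with patch-wise image correlation between two co-registered orthomosaics. The sketch below uses the phase cross-correlation routine from scikit-image as a stand-in for tools such as COSI-Corr or dedicated PIV software; all file names, grid parameters and the time step are assumed values.

```python
import numpy as np
from skimage import io
from skimage.registration import phase_cross_correlation

# Two co-registered, grey-scale orthomosaics separated by dt days (paths illustrative).
img0 = io.imread("ortho_day0.tif", as_gray=True)
img1 = io.imread("ortho_day7.tif", as_gray=True)
gsd, dt = 0.2, 7.0            # ground sampling distance [m/px] and survey interval [days]; assumed

win, step = 128, 64           # correlation window size and grid spacing in pixels
rows = list(range(0, img0.shape[0] - win, step))
cols = list(range(0, img0.shape[1] - win, step))
velocity = np.full((len(rows), len(cols)), np.nan)

for i, r in enumerate(rows):
    for j, c in enumerate(cols):
        ref = img0[r:r + win, c:c + win]
        mov = img1[r:r + win, c:c + win]
        # Sub-pixel displacement of this patch between the two survey dates.
        shift, error, _ = phase_cross_correlation(ref, mov, upsample_factor=10)
        velocity[i, j] = np.hypot(*shift) * gsd / dt   # metres per day

print("Median surface velocity [m/day]:", np.nanmedian(velocity))
```

Erroneous matches over featureless or shadowed patches would still need filtering, as noted for the studies above.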

In a related application, McGill et al. (2011) document the deployment of a UAV off the deck of a research ship in the Southern Ocean with the objective of identifying and tracking free-floating icebergs. In this case, the iceberg movement was tracked via GPS tags dropped onto the icebergs that communicated regular position reports. Most recently, Jouvet et al. (2019b) used the on-board differential GNSS receiver of the UAV itself as a method of in-situ sensing of glacial motion, by landing the UAV on the fast-moving Eqip Sermia tidewater glacier in west Greenland and recording its movement over several hours.

4.5.7 Estimating aerodynamic roughness

UAV-based roughness surveys offer the opportunity to more adequately represent the heterogeneity of glacier surfaces and better parameterise the ice aerodynamic roughness length (z0) in distributed melt models. The high-resolution topographic data generated via UAV-based LiDAR or SfM photogrammetry are often gridded at ~10⁰–10² m horizontal resolution, yet the raw point clouds are often of much higher resolution. Several studies have sought to take advantage of this data abundance to produce sub-grid metrics and thus to obtain distributed maps of ice surface roughness for both sea ice (e.g. Crocker et al., 2012; Wang et al., 2018c) and mountain glaciers (e.g. Rippin et al., 2015; Rossini et al., 2018). Chambers et al. (2020) demonstrate that roughness data obtained via a UAV can be used to obtain an estimate of z0, which is an important control on turbulent heat fluxes over glaciers. Whereas melt modelling studies often assume a spatially and temporally uniform value of z0, field evidence notes variability over several orders of magnitude (Brock et al., 2006); although issues with scale-dependency remain, UAVs offer the ability to provide distributed maps of z0 for use in such melt models.
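
As a simple illustration of how such microtopographic data can be turned into a z0 estimate, the sketch below applies Lettau's (1969) relation z0 ≈ 0.5 h* s/S to a detrended DEM patch, taking h* as twice the standard deviation of the detrended elevations (a common choice in the glaciological literature) and s/S as the ratio of upwind-facing silhouette area to plan area. This is only one of several possible parameterizations and is not necessarily the method of Chambers et al. (2020); the synthetic surface used here is purely illustrative.

```python
import numpy as np

def z0_lettau(dem_patch, cell_size, wind_axis=1):
    """Rough z0 estimate from a DEM patch via Lettau's (1969) relation z0 ~ 0.5 * h* * s / S.

    h* : effective obstacle height (here: twice the std of the mean-removed elevations),
    s  : total upwind-facing silhouette area,
    S  : plan (ground) area of the patch.
    All of these choices are illustrative; published implementations differ in detail.
    """
    # Remove the mean elevation (a planar detrend would normally be applied first).
    z = dem_patch - np.nanmean(dem_patch)
    h_star = 2.0 * np.nanstd(z)

    # Upwind-facing silhouette: sum of positive elevation steps along the wind direction,
    # each multiplied by the cell width to give an area.
    dz = np.diff(z, axis=wind_axis)
    s = np.nansum(np.where(dz > 0.0, dz, 0.0)) * cell_size

    S = dem_patch.shape[0] * dem_patch.shape[1] * cell_size ** 2
    return 0.5 * h_star * s / S

# Example with a synthetic 20 m x 20 m patch at 5 cm resolution (random roughness elements).
rng = np.random.default_rng(0)
patch = rng.normal(scale=0.05, size=(400, 400))
print(f"z0 ~ {z0_lettau(patch, cell_size=0.05):.3f} m")
```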


4.5.8 Albedo measurement

Albedo is an additional important uncertainty in ice and snow surface energy balance modelling and plays a crucial role in modulating the fraction of absorbed shortwave radiation. Yet, albedo measurements have been subject to the same data limitations described above for z0. Field studies have observed pronounced spatial and temporal albedo variability (e.g. Jonsell et al., 2003); however, the coarse pixel resolution of satellite-based estimates (typically several hundreds of metres) limits their ability to detect this. UAV-based digital imagery has been interrogated by Rippin et al. (2015), who proposed that pixel RGB values can be used to provide a crude 'albedo proxy', albeit subject to relatively large errors arising from variable illumination conditions.

Ryan et al. (2017) evaluated the performance of similar image-based analysis of albedo over 280 km² of the Greenland ice sheet by comparing estimates with the ratio of upward and downward reflectance as measured by a pair of broadband pyranometers also mounted on the UAV. A white Teflon reference target was used to convert downward radiation into digital numbers and correct for the variable illumination issue. In the resulting 20 cm resolution albedo field, distinct patterns were observed; the influence of local topographic variability on albedo was pronounced, with crevassed areas exhibiting lower albedo values than low-relief areas. In an intercomparison study, Burkhart et al. (2017) noted that although UAV-based reflectance estimates were slightly higher than MODIS-based values, they were in close agreement, and UAV-based values have the potential to provide insight into sub-pixel variability of MODIS data products.
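
The core of the pyranometer-based approach is simply the ratio of reflected to incoming shortwave radiation. The minimal sketch below assumes a logged time series with illustrative column names and quality filters; it is not the processing chain of Ryan et al. (2017).

```python
import numpy as np

# Synchronous records of downward and upward shortwave irradiance [W m^-2] from two
# UAV-mounted pyranometers (file name and column names are illustrative).
data = np.genfromtxt("pyranometer_log.csv", delimiter=",", names=True)
sw_down, sw_up = data["sw_down"], data["sw_up"]

# Broadband albedo is the ratio of reflected to incoming shortwave radiation.
albedo = sw_up / sw_down

# Keep only physically plausible values from well-illuminated samples (assumed thresholds).
valid = (sw_down > 200.0) & (albedo > 0.0) & (albedo < 1.0)
print(f"Mean surface albedo: {albedo[valid].mean():.2f}")
```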

To examine the effect of Saharan mineral dust on snow albedo in the European Alps, Di Mauro et al. (2015) combined ground-based high-resolution measurements of reflectance spectra with UAV-derived orthophotographs. They developed a relationship between mineral dust concentration and the normalized ratio between red and green wavelengths, thereby permitting an estimation of the spatial variability of mineral dust deposits and their influence on surface reflectance.

4.5.9 Surface temperature measurement

Mounting thermal sensors on UAVs offers the potential for distributed maps of surface temperature (chapter 2.4). Kraaijenbrink et al. (2018) demonstrate this potential on the debris-covered Lirung Glacier in the Central Himalaya. Given the complex, nonlinear influence of supraglacial debris layers on surface energy budgets (via enhanced radiation absorption when the layer is thin and insulation of the ice when the layer is > 5 cm), an enhanced understanding of the variability of surface temperature yields important insight into the glacier melt process. Higher surface temperatures are assumed to indicate more effective insulation of the cold ice by thicker debris layers. For ground control, targets were wrapped in aluminium foil to ensure a distinct radiant temperature from the surrounding surface. In common with the glacier surface parameters described above, repeat UAV-based thermal imagery of the debris-covered glacier revealed pronounced spatial and temporal variability in surface temperatures over the glacier (Kraaijenbrink et al., 2018), including a range of nearly 50 °C observed in a single morning. While methodological issues relating to sensor bias and the estimation of spatially distributed emissivity remain, the UAV-based measurements revealed pronounced patterns that could not be established from either satellite-based measurements or relatively sparse in situ temperature measurements.

4.5.10 Challenges and future opportunities

As previous sections have demonstrated, the potential of UAVs for progressing cryospheric research is being exploited increasingly. UAVs offer a valuable platform on which to mount a number of different sensors (e.g. radar systems, pyranometers, thermal sensors) to obtain a range of measurements of great importance to the discipline. A common theme emerges in relation to scale and resolution. For the most part, measurements taken from UAVs bridge the scale gap between challenging, time-consuming and often expensive and hazardous direct field measurements of glacial properties at one or several points, and distributed yet coarse satellite estimates of the same properties. The size and remoteness of glaciers and ice sheets dictate that sparse field observations be interpolated; yet, where field measurements do exist, local heterogeneities are observed that could not be detected from satellite-based estimates.

UAV-based studies of melt rates, surface roughness, ice surface velocity, surface temperature and albedo have each quantified pronounced spatio-temporal heterogeneity. Early work with UAV-mounted radar sensors suggests that heterogeneities are also present at the glacier bed and within snow and ice. The landscape-scale variability in all properties reveals an underlying complexity of pattern and process. While this has been regularly observed and reported qualitatively, UAVs now provide the ability to quantify and formalise these observations. Moreover, the ability to derive a wide range of measurements from a single sensor has permitted several authors to examine relationships between them; for example, Rossini et al. (2018) quantify the effect of glacial brightness and roughness on surface lowering, all derived from a UAV. Yet, the use of UAVs for each of the above applications remains in its infancy and has yet to be used operationally to provide distributed inputs to surface energy balance models, thereby reducing the uncertainties introduced by restrictive assumptions of spatial and temporal uniformity.

That is not to suggest that UAVs represent a panacea for cryospheric data collection. In the high-altitude and/or high-latitude locations where cryospheric research is typically undertaken, meteorological conditions are at best unfriendly to UAVs and often exceed all operational limits (Bühler et al., 2017). Mountain glaciers are often located in places with poor GNSS signals, while high winds (Arnold et al., 2018) and low air pressure at altitude (Wigmore & Mark, 2017) present further obstacles to undertaking safe and effective UAV surveys. Given these challenging conditions, it is inevitable that UAVs will be lost or experience crash landings (e.g. McGill et al., 2011; Jouvet et al., 2019b; Figure 4.5-3). Poor visibility also reduces available survey time, though Wang et al. (2018c) were able to implement a 'defogging' algorithm to extract useable data from UAV-based images during periods of light fog. Moreover, extreme cold limits battery life and reduces the time available for aerial surveys. In mountain areas, large and smooth landing sites can be difficult to locate; Bühler et al. (2016) note that multirotor UAVs with vertical take-off and landing capabilities offer an advantage in such areas, though Harder et al. (2016) note that fixed-wing systems now achieve landing accuracies within ~5 m. Meanwhile, in high-latitude applications, beyond-visual-line-of-sight surveys mean that landing sites can be tens of kilometres from target survey areas (Zmarz et al., 2018).

Figure 4.5-3: Poor GNSS signals coupled with mountain winds result in more UAV take-offs than landings in cryospheric research.


The low solar angles of high-latitude cryospheric surveys add complexity to studies requiring consistent lighting (Cimoli et al., 2017), and the surface texture of ice and snow can prove especially challenging for photogrammetric surveys. While snow surfaces yield lower survey point densities than other environments, Gindraux et al. (2017) suggest that only fresh snow surfaces are problematic in this regard and that point density improves with each additional day. Finally, in common with other UAV applications, legislation continues to evolve and to place limits on the use of UAVs in cryospheric research, even in remote locations such as Antarctica (Leary, 2017).

On balance, the proliferation of UAV-based cryospheric studies in just the last few years indicates that the benefits of UAV use outweigh the challenges. Clearly, UAVs are set to remain a key component of the glaciologist's toolbox moving forward.

References for further reading


4.6 UAVs in Volcanology

Einat Lev

4.6.1 UAV application for volcano science
4.6.1.1 Imaging applications
4.6.1.2 Non-imaging applications
4.6.1.3 Geophysical measurements
4.6.2 UAVs for volcanic disaster response
4.6.3 Summary

Volcanologists study volcanic systems for two main reasons: the first is to improve our understanding of volcanoes and volcanic eruptions in order to provide better hazard assessment and support risk reduction; the second is to use volcanoes as portals that connect the Earth's interior to the outside environment of the biosphere and atmosphere in which we live. Questions that volcanologists try to answer include: What are the precursors to an eruption? How long before an eruption can we identify the precursors? Once an eruption begins, how will it evolve and when will it end? More fundamental questions include: What is the relationship between the magmatic/volcanic evolution of a region and its tectonic history? What is the ratio between the volume of magma that erupts extrusively as lava and ash, and magma that is emplaced intrusively and builds the crust internally? What messages do volcanic products carry about mantle processes such as plate subduction and plumes?

To answer these questions, volcanologists collect a wide range of observations, depending on the goal of the study as well as the situation (that is, quiescence or an unrest or eruption crisis). For example, to assess the flux of magma at a volcano, scientists need to measure the volume of eruptive products such as lava and ash, as close to the time of the eruption and over as many eruption cycles as possible. To predict how an erupted lava flow will travel from the vent and provide appropriate warning to down-flow communities, scientists must have up-to-date knowledge of the existing topography and its roughness, as well as how quickly the lava is coming out of the vent (lava flux) and the lava's physical properties (e.g., temperature, viscosity, density). Measuring the flux and composition of gases emitted from volcanoes is also critical: gases released by volcanoes during and between eruptions are harmful to health and vegetation, and at the same time can provide important clues about the movement of magma underground and the potential for an imminent eruption.

Unfortunately, collecting observations at volcanoes can be a difficult task. Volcanoes, especially active ones, are usually rugged terrains characterized by unsteady ground and rock surfaces, such as glassy and fragile lava flows or poorly consolidated ash layers. The topography at volcanoes is often challenging, with many volcanoes towering steeply to high elevations above their surroundings. The eruption products that volcanologists want to study, such as ash layers, tephra and lava flows, often extend over large areas. These factors make collecting observations on foot difficult, time consuming, and often dangerous. During volcanic unrest, just before or during an active eruption, it is hazardous or even illegal to approach the volcano, even to collect data. For these reasons, UAVs and their ability to provide access to difficult areas have been revolutionizing volcanology for over a decade now, with a rapid increase in their use since the introduction of low-cost platforms.

4.6.1 UAV application for volcano science

Over the past two decades, UAVs have proved extremely useful for volcanologists, providing a wide range of observations, including both imaging and non-imaging examples. The following sections review examples of applications of UAVs to volcanology, for purposes of scientific research as well as disaster response. The examples are divided into imaging applications, which cover those based on photo or video data, and non-imaging applications, which include sampling of gases, ash, and water, geophysical measurements, and instrument deployment. Interested readers are referred to a recent article (James et al., 2020b) that provides a thorough review of techniques, equipment, and applications of UAVs in volcanology.

4.6.1.1 Imaging applications

As discussed in previous sections, modern small UAVs are frequently equipped with cameras, and collecting aerial pictures and videos is one of the most common uses of UAVs. Cameras mounted on a UAV can be visible-light cameras, or sensitive to thermal infrared or multispectral radiation. Images and videos collected by these cameras can be used for a wide range of applications, which this section reviews.


It is essential for volcanologists to know the pre-eruptive topography of the areas to be covered by eruption products such as ash and lava. This baseline topography facilitates accurate hazard assessment, as it serves as the input for forward flow models that predict the routing of lava flows, lahars and pyroclastic density currents. It also allows a more accurate assessment of the total volume of eruptive products, and thus the per-eruption magmatic flux. UAVs help obtain topographic data through two main techniques: image-based Structure-from-Motion (SfM) and UAV-mounted Laser Imaging, Detection, and Ranging (LiDAR) units. Both SfM and LiDAR techniques were described in detail in chapters 2.2 and 2.6, respectively. Favalli et al. (2018) conducted both LiDAR and UAV-based SfM topographic surveys of the same 1974 lava flow on Etna and compared the advantages and disadvantages of each method. The higher spatial resolution obtained with the UAV allowed the scientists to capture flow features such as cracks, folds and blocks, which reveal details about flow emplacement rates and dynamics.
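
The volume calculation itself reduces to differencing co-registered pre- and post-eruption DEMs and summing the positive thickness change over the deposit. The sketch below assumes two rasters on an identical grid; the file names and the 0.5 m level of detection are illustrative only.

```python
import numpy as np
import rasterio

# Pre- and post-eruption DEMs, co-registered on an identical grid (paths illustrative).
with rasterio.open("dem_pre_eruption.tif") as pre, rasterio.open("dem_post_eruption.tif") as post:
    z_pre = pre.read(1, masked=True)
    z_post = post.read(1, masked=True)
    cell_area = abs(pre.transform.a * pre.transform.e)   # pixel area in m^2

# Positive elevation change corresponds to the thickness of new deposits (lava, tephra).
dh = z_post - z_pre
deposit = np.ma.masked_less(dh, 0.5)          # assumed 0.5 m level of detection
volume = float(deposit.sum()) * cell_area

print(f"Deposit area: {deposit.count() * cell_area / 1e6:.2f} km^2")
print(f"Bulk deposit volume: {volume / 1e6:.2f} x 10^6 m^3")
```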

SfM applications usually use visible-light images. However, given the thermal anomalies often associated with volcanoes, thermal infrared (IR) images often provide additional data that can be used in SfM analysis. This is particularly helpful when the view of the region of interest (e.g., a crater, a vent or an active fissure) is obstructed by opaque clouds of gases.

Lava flows

One of the first notable applications of a UAV to study the emplacement of an active lava flow was during the 2014–2015 eruption of Kīlauea volcano in Hawai'i. An extensive lava flow field erupted from a vent on the eastern flank of the volcano and made its way towards the city of Hilo and its suburbs. Repeated surveys of the flow field with UAVs provided time-dependent topography data and allowed scientists to measure the flow's advance rate and volumetric flux, and to detect flow inflation and stalling. These measurements fed directly into the flow routing models used by the USGS to provide rapid hazard assessment and to update forecasting (Turner et al., 2017). During later eruptions at Kīlauea, UAV-derived topographic and thermal data revealed details of the development of flow breakouts and a tube system (e.g., Biass et al., 2019; Dietterich et al., 2018). The use of UAVs for mapping lava flow extent and thickness has been popular at Mount Etna in Italy. De Beni et al. (2019) documented the diversion and divergence of a 2017 lava flow by a topographic obstacle (a small mountain) that stood in its path.

The 2018 eruption of Kīlauea volcano provided a testing bed for UAVs in the context of an active volcanic eruption; Figure 4.6-1 shows an example. Videos captured by UAVs that hovered almost stationary over specific spots along the flow channel yielded unprecedented estimates of flow velocity and flux and their change over time (Patrick et al., 2019). These videos have also been used to constrain the rheology of the flowing lava by providing a constraint against which to test numerical models of lava flow (Conroy and Lev, 2021).

Figure 4.6-1: UAV-derived observations were key in documenting the evolution of lava flows during the 2018 eruption of Kīlauea volcano, Hawai'i. Hovering over the lava channel, UAVs captured videos of the flowing lavas and documented changes and surges in lava flux. (A) View of a lava channel during a high-flow time. (B) View of the lava channel during a low-flow time. (C) average and (D) cross-channel profiles of lava velocity during high and low flow times as measured using particle image velocimetry analysis on the captured videos. From Patrick et al. 2019, reprinted with permission of the American Association for the Advancement of Science. All Rights Reserved.

Lava flows are classified by their surface morphology (e.g., two of the main types of lava deposits, called "pahoehoe" and "a'a", have smooth and very rough surface morphologies, respectively). Flow morphologies are indicative of the conditions of their emplacement. It is thus useful to characterize and classify past and new lava flows. However, assessing morphology on foot can be dangerous, time consuming, and often unfeasible over large areas. UAVs have been used recently to help classify large areas of lava flows. For example, two classification efforts for the 2018 lava flows of Sierra Negra volcano in the Galápagos utilized: 1) machine learning analysis of the orthomosaic visible image of the flow (Soule et al., 2019), and 2) a combination of roughness estimates from an SfM-based DEM and a grain-size proxy based on the ground heating rate (Carr et al., in review; see Figure 4.6-2). Both efforts proved that, with a few hours of data collection from the safety of a UAV launch point, a large area can be mapped and analyzed efficiently and accurately.

Figure 4.6-2: An example of surface deposit (lava flow and tephra) morphology classification using data collected by a UAV. Top left: heating rate (°C/hr), measured by flying an infrared thermal camera three times (before, at and after sunrise) and measuring the change in apparent temperatures; Top right: small-scale surface roughness, derived from a 20 cm/pixel DEM constructed from visible-light images using SfM; Bottom: automated classification results using the k-means method with 3 categories. Image reprinted from Carr et al., 2021, with permission from Elsevier. All Rights Reserved.
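
The second classification workflow described above can be sketched in a few lines: per-pixel roughness and heating-rate rasters are stacked, standardised and clustered with k-means. The snippet below is a generic illustration with assumed, pre-computed input arrays; it is not the actual processing of Carr et al.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Two co-registered raster layers as 2-D arrays (loading is assumed; values illustrative):
# roughness from a 20 cm SfM DEM and heating rate from repeated thermal surveys.
roughness = np.load("roughness.npy")          # assumed pre-computed raster
heating_rate = np.load("heating_rate.npy")    # assumed pre-computed raster

# Stack the per-pixel features, standardise them, and cluster into three classes
# (e.g. pahoehoe-like, 'a'a-like, tephra), mirroring the k-means step described above.
features = np.column_stack([roughness.ravel(), heating_rate.ravel()])
features_std = StandardScaler().fit_transform(features)

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features_std)
class_map = labels.reshape(roughness.shape)

for k in range(3):
    print(f"Class {k}: {np.mean(class_map == k):.1%} of the mapped area")
```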


When the lava erupting from the vent is highly viscous, for example due to its high-silica composition or relatively low temperature, it forms a dome. Domes are particularly dangerous, since they can become unstable and collapse, forming hazardous pyroclastic density currents, even without active effusion. Such collapses are difficult to predict and present a challenge for hazard assessment and mitigation. In some cases, a dome collapse can relieve enough pressure from the underlying magma to trigger a large eruption. It is therefore important to track the growth, and assess the structure and stability, of domes during and beyond their emplacement. Domes are usually inaccessible, making them an attractive target for UAVs. Figures 4.6-3 and 4.6-4 show examples of domes documented by UAVs at two volcanoes in Indonesia.

Figure 4.6-3: Topographic change analysis of a lava dome at Merapi volcano, Indonesia. (a) Shaded reliefs of the 2012 and (b) 2015 Digital Elevation Models and (c) cross-section profiles of lines h–i and j–k show the detailed geometry of the open fissures and lava dome at Merapi Volcano before and after a series of steam explosions that occurred between 2012 and 2014. Coordinates are in UTM meters. (d) Changes in topography reveal the aftermath of the explosions, where red areas indicate deposition and blue areas indicate loss. An unstable block is also identified. Image reprinted from Darmawan et al. (2018), with permission from Elsevier.


Figure 4.6-4: Lava dome stability analysis made possible through UAV topography mapping. The lava dome at Sinabung Volcano, Indonesia, grew between 2010 and 2018. Carr et al. (in prep.) performed a UAV survey (A) to gather images, from which they built a DEM that was compared with the pre-eruption DEM from 2010 to reveal dome growth (red) and collapse (blue) sites (B). The DEM was used as input to a numerical slope stability software package which calculates the Factor of Safety (FoS), a parameter that quantifies the likelihood of collapse. For FoS < 1, the lower the value of FoS, the more unstable the section.

Topographic change at volcanic craters

The accessibility and relative ease of using UAVs make them ideal for detecting change through repeat surveys. UAVs that use pre-programmed flight paths can easily re-fly the same path over and over, facilitating highly accurate change detection. This ability has already been used at several volcanoes to quantify syn-eruptive change in rapidly evolving crater areas. For example, Smets et al. (2018) measured the amount of lava (6.9×10⁶ m³) that filled the crater of Nyamulagira volcano, in the D.R. of Congo, during lava fountain eruptions in 2014. During the 2018 eruption of Kīlauea volcano, daily surveys of the summit caldera with UAVs produced an extensive data set documenting the collapse of the caldera floor in response to the emptying of the underlying magma chamber, as magma migrated through the East Rift Zone and erupted as lava flows (Neal et al., 2019; Figure 4.6-5). The repeated topographic surveys allowed scientists to assess the connection between the summit reservoir and the eruption site and helped improve warnings for changes in effusion rates.

4.6.1.2 Non-imaging applications

The range of observations that volcanologists seek in their pursuit to understand volcanoes better goes far beyond the appearance and topography that can be viewed and documented using images. This section reviews non-imaging applications, in which scientists have relied on UAVs to collect samples, make geophysical measurements, and deploy instruments on volcanoes.


As magma ascends within volcanoes, it releases gases such as SO2, CO2, and H2S. The amount and composition of the gases being released can change over time and provide an indication that magma is getting closer to the surface or that fresh magma has been added to the reservoir (e.g., Aiuppa et al., 2007). These signals can suggest that an eruption is approaching, thus providing important information for hazard assessment.
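
A quantity commonly derived from such in-plume records is the molar CO2/SO2 ratio, obtained from the slope of background-corrected CO2 against SO2. The sketch below illustrates this with an assumed log format and illustrative thresholds; it is not the processing chain of any particular study cited here.

```python
import numpy as np

# Time series logged while transecting the plume (file and column names are illustrative):
# SO2 and CO2 mixing ratios in ppm from a Multi-GAS-style payload.
log = np.genfromtxt("plume_log.csv", delimiter=",", names=True)
so2 = log["so2_ppm"]
co2 = log["co2_ppm"]

# Remove the ambient CO2 background, estimated here from samples outside the plume
# (where SO2 is essentially zero; the 0.1 ppm cut-off is an assumed value).
background = np.median(co2[so2 < 0.1])
co2_excess = co2 - background

# The slope of excess CO2 against SO2 over in-plume samples gives the molar CO2/SO2 ratio,
# a quantity widely used to track magma ascent between and before eruptions.
in_plume = so2 > 0.5                     # assumed in-plume detection threshold
slope = np.polyfit(so2[in_plume], co2_excess[in_plume], 1)[0]
print(f"Molar CO2/SO2 ratio: {slope:.1f}")
```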

Figure 4.6-5: The collapse of the summit caldera at Kīlauea volcano in 2018, as documented by repeat UAV surveys. The caldera collapsed in response to emptying of the magma reservoir, as magma flowed into the East Rift Zone and erupted as lava flows. Top: oblique aerial view of the caldera; Middle: shaded relief maps of the caldera topography from before (2009) and after (August 2018) the eruption. Bottom: change along the cross-section line marked in the maps. From Neal et al. (2019), reprinted with permission of the American Association for the Advancement of Science. All Rights Reserved.


In recent years, UAVs have been equipped with small gas sensors that either collect gas samples to bring back to the ground, or measure gas composition and concentration as the UAV flies through the volcanic gas plume. The utilization of UAVs to collect and measure volcanic gas has a significant advantage over past methods such as ground-based gas detectors (e.g., DOAS, UV cameras) or aerial gas surveys using manned helicopters, which put the crew at great risk and come at a large cost. For these reasons, the field of volcanic gases has seen an explosion of applications. Efforts have focused on miniaturizing sensors so that they can be carried by smaller and cheaper platforms, and on combining sensors for different gases, to make data collection most effective (Figure 4.6-6).

Ash sampling

Another volcanic product that is important to collect is ash. Ash particles are a major hazard to local communities as they are a respiratory irritant and damage crops and structures. They are also dangerous for aircraft engines and can cause airspace closures. Scientifically, ash particles are important as they are the freshest eruption products and quench immediately upon exiting the volcano, and thus preserve critical information such as ascent rates and magma storage depths. Ash is light and often gets dispersed soon after the eruption, so it is important to collect samples quickly, balancing this need against the obvious danger of doing so. UAVs can assist with this challenge by collecting ash samples from inaccessible locations and during an eruptive crisis (Figure 4.6-7). Examples where this has been done are reported by Nagatani et al. (2013, 2014, 2018), who collected ash samples from the ground of Asama, Fuji, Izu Oshima, Unzen and Sakurajima volcanoes in Japan using a remotely operated roller-based soil sampler suspended from UAVs. Schellenberg et al. (2019) mounted sticky stubs used in scanning electron microscopes (SEMs) to collect ash while flying through the plume of Volcán de Fuego, Guatemala.


Figure 4.6-6: Examples of UAV platforms used for collecting volcanic gas information, and the data gathered by them. (A) Multicopter carrying a multigas unit and a gas-collecting pump-and-bag unit used during the Geldingadalir eruption in 2021 (photo: Yves Moussallam). (B) A miniaturized Flame spectrometer (by Ocean Insight) combined with a Raspberry Pi controller, before being mounted on a multicopter during the 2021 Cumbre Vieja eruption in Spain (photo: Mike Burton). (C-D) A SIERRA fixed-wing UAV that carried a miniaturized mass spectrometer (inset) in its nose compartment, and the data that Pieri et al. (2013) collected with it at Turrialba volcano, Costa Rica. (E) Measurements (top-right inset) of multiple gas types measured by an Aeroterrascan Ai450 drone (bottom-left inset) carrying a Multi-GAS instrument. The background shows the topography of Agung volcano, Indonesia, and the flight path. Image from Syahbana et al. (2019), originally published under a CC BY license (https://creativecommons.org/licenses/by/4.0/). (F) CO2 sampling by a remotely operated pump-and-bag system suspended beneath a DJI Inspire 1 UAV as it is flown through the plume of Poás volcano, Costa Rica (James et al., 2020a, photo by Fiona D'Arcy; image originally published under a CC BY license, https://creativecommons.org/licenses/by/4.0/). The collected sample is then analyzed in the laboratory.


Figure 4.6-7: Ash sampling using UAVs. (A) An example of a roller-based ash collector. © 2014 IEEE. Reprinted, with permission, from Yajima et al. (2014). (B) Ash particles collected by sticking to an SEM stub mounted on a UAV, by Schellenberg et al. (2019), originally published under a CC BY license (https://creativecommons.org/licenses/by/4.0/).

Water sampling

Water in volcanic systems often interacts with the magma to form hydrothermal systems. As such, changes in water composition and temperature can reveal clues about changes in the subsurface magmatic system, such as an intrusion of fresh magma. Sampling water from UAVs has been done using non-metallic bottles or pistons fitted with one-way ball valves that seal as the device is pulled up. Example locations include a lake that formed in the collapsed crater of Kīlauea volcano after the 2018 eruption, the Yugama crater lake at Kusatsu-Shirane volcano, Japan (Terada et al., 2018), and low-viscosity water-rich muds at the Lusi mud volcano (Di Stefano et al., 2018).

4.6.1.3 Geophysical measurements

Geophysical measurements such as gravity, electrical conductivity, and magnetism are a common tool for imaging the structure of volcanoes, as they can reveal the properties of volcanic deposits and point to changes in the position of melt and gas pockets. For example, mafic rocks show a higher magnetic intensity and silicic rocks a lower magnetic intensity; magnetic intensity can also track lava cooling and hydrothermal alteration of deposits (Koyama et al., 2013). Conducting geophysical studies on land at volcanoes can be difficult for all the reasons discussed previously. Aerial surveys can help. In recent years, geophysical sensors have become sufficiently small and light to be fitted on UAVs and used to survey volcanoes. Examples of aeromagnetic surveys using UAVs come mainly from Japan, and include Izu-Oshima volcano (Kaneko et al., 2011), Kuchinoerabu-jima volcano (Ohminato et al., 2017), and the 2011 eruption of Shinmoe-dake volcano (Koyama et al., 2013). UAV-based gravity surveys are yet to be conducted at volcanoes.


UAVs are capable not only of collecting data using on-board sensors, but also of deploying sensors that will collect data on the ground (Figure 4.6-8). UAVs have the advantage of being able to place such sensors in inaccessible places, such as an active volcano's crater or on fresh deposits, or during an eruption. Sensors deployed this way must be capable of transmitting their data remotely, since in most cases retrieval of these sensors is impossible and they are likely to be destroyed by an eruption. They also must be light-weight and autonomous. For example, small, glass-shelled sensor capsules nicknamed 'Dragon Eggs' are being developed and will allow deployment of flexible sensor networks at volcanoes. Each Dragon Egg is equipped with sensor packages, including gas sensors (SO2, H2S, relative humidity, temperature, pressure), GPS receivers, and vibration sensors (Wood et al., 2018). The UAV deploying the Dragon Eggs is equipped with a remotely operated custom release hook that is triggered once the sensor unit has been 'placed' (not dropped) at the desired site. Ohminato et al. (2017) reported depositing specially designed seismometers near the vent of Kuchinoerabu-jima volcano, Japan, an area inaccessible in other ways. The solar-powered seismometers weighed just 5 kg, to fit within the aircraft's payload, and were equipped with an aluminum tripod landing gear to stabilize their placement on the ground. Data were transmitted through a commercial cellphone network, a convenience not always available in remote volcanic areas.

Figure 4.6-8: Instrument deployment at volcanoes by UAV. A) An unmanned helicopter deploying a seismic observation module at Kuchinoerabu-jima volcano, Japan (see Ohminato et al., 2017). The left panel shows the winch and cable system used to lower the seismic observation module onto the volcano; the middle panel shows the components of the package, including ground motion sensor, aluminum tripod, solar panel and battery, cellular phone antenna, and GPS antenna. Photos: Takayuki Kaneko and Takao Ohminato. B) A multirotor UAV (DJI M100) deploying a 'dragon-egg' sensor package at Tavurvur volcano, Papua New Guinea to measure fumarole activity (Wood et al., 2018). The package included SO2, H2S, relative humidity, temperature and pressure sensors, GPS receivers, and vibration sensors. Photo: Kieran Wood.


Volcanoes are of course not just a fascinating scientific target, but also a source of risk for communities who live on or near them. UAVs now play an important role in assisting emergency crews and government agencies in their response to and management of an active volcanic disaster. During the 2018 eruption of Kīlauea, UAVs flown by the local university and the USGS provided night-time observations of new fissures opening and lava flow advance when no helicopter or plane could fly. UAVs hovered over the flowing lava and provided real-time estimates of lava flux and speed, which the USGS immediately entered into hazard models. Moreover, during the 2018 Kīlauea eruption, a UAV was used to guide a stranded resident out from their home to safety. During an active or ongoing eruption, governments usually declare a no-fly zone for manned aircraft due to the danger to the crew. This risk is alleviated when using UAVs, which can provide real-time monitoring of the summit or vent areas even when they are obstructed by crater walls or a plume of ash or gases, as is often the case.

4.6.3 Summary

The role of UAVs in reshaping volcanology into a data-rich science cannot be overstated. UAV capabilities are now within reach of even the most cash-strapped scientists and observatories, opening the door to a wide range of data that was previously impossible to collect due to cost, danger, or inaccessibility. The near future will undoubtedly see new applications, as different sensors become small enough for a UAV to carry or deploy and as data collection and processing methods mature.

References for further reading


4.7 UAVs in agriculture

Dirk Hoffmeister

4.7.1 Spectral data
4.7.2.1 Thermal
4.7.2.2 RGB imagery
4.7.2.3 Multi- and hyperspectral

4.7.3 Structural data
4.7.3.1 Multi-temporal approach
4.7.3.2 LiDAR
4.7.4 Further applications

Agricultural intensification is necessary for feeding a rapidly growing human population (about 1 %/a; the current 7.8 bn people are projected to reach 10 bn by 2057). This agricultural intensification needs efficient irrigation, fertilisation, and pest control. An overapplication of these components is a financial waste and leads to environmental threats, such as soil erosion and degradation. The most important factor is nitrogen (N) application (70 % of all fertilizers), which shows an efficiency of only 30–50 %; its overapplication leads to excessive nitrate concentrations in rivers and seas, as well as contamination of ground water. Likewise, global warming (e.g. increased temperatures, higher variability of rain) and restricted access to fertilizer (increasing costs) and water (more irrigation needed) will further affect agricultural productivity in the future.

Therefore, agricultural production must be as efficient as possible to maximise food production, while minimizing effects on the environment (e.g. by using too much fertilizer or water for irrigation). For an efficient nitrogen application, representing every other stated factor, the 'right rate, right type, right placement, and right timing' (Houlton et al., 2019, p. 867) is important. Thus, an accurate and easy measurement of plant status that allows management measures to be adjusted is important. These targets are summarized in the area of precision agriculture, which aims to optimize all management tasks at field level. Therefore, biochemical and biophysical properties must be monitored, and homogeneous zones for management should be automatically


characterized and delineated. This overall aim can be subdivided into the research areas of phenotyping and yield assessment, as well as abiotic and biotic stress detection (Olson and Anderson, 2021) (see Figure 4.7-1).

For this purpose, aerial images, also with infrared bands, have been in use for disease detection since the 1920s (Colwell, 1956; Gerten and Wiese, 1987). First applications with petrol-powered fixed-wing UAVs or helicopters were conducted in the early 2000s, and NASA's solar-powered Pathfinder-Plus UAV was also used for agricultural monitoring tests (Herwitz et al., 2004). Early examples are the successful application to shrub estimation (Quilter and Anderson, 2001), estimating plant biomass and nitrogen content with a multispectral imaging sensor (Hunt et al., 2005), the documentation of water stress in crops (Berni et al., 2009), and mapping rangeland vegetation (Laliberte and Rango, 2009).

The quick acquisition of data at critical points during the growing period is an advantage. In contrast to satellite imagery, which needs to be ordered and planned in advance, UAV-based surveys are not obstructed by clouds. Likewise, all other advantages of UAVs, e.g. covering larger areas quickly, in combination with the miniaturisation of sensors, are of importance for this area of application. A main advantage and research target is the non-destructive determination of plant parameters, such as plant height. In general, UAV-based surveys are cost-effective and enable the acquisition of high-resolution images (ground sampling distances in the centimetre range), which are needed for applications in precision agriculture. In contrast, lighting conditions strongly affect acquisition and results. Good weather conditions within the recommended time slot of about two hours around local noon are not always given. Likewise, all regulations concerning UAV applications, access and ownership rights (chapter 1.4), nearby habitat areas, and birds of prey are practical problems, which may prevent or disrupt an acquisition. Further, the complexity of applying thermal, multi- and hyperspectral sensors, the analysis steps, as well as the acquisition of stable, highly accurate ground-reference points might be additional barriers for these applications.

However, applications of UAVs in agriculture are manifold and form one of the major application areas of UAVs overall. Mainly, crop height and crop growth distribution, yield estimations, and crop health status, as well as the detection of damage from pathogens, weeds and insects, are areas of research (see Figure 4.7-1). All of the approaches ultimately aim to increase the effectiveness of inputs and optimize the output of a crop, also in terms of environmental protection, through adjusted farm management. Therefore, single surveys or multiple surveys over time are conducted, ranging from simple overviews to enhanced analyses using nearly every sensor shown in the previous chapters. Overall, the application of UAVs in agriculture is a huge market and, in terms of environmental protection, an important factor. Most often, UAV-based applications are used for phenotyping and other experimental trial areas, in general agricultural management, viticulture and horticulture. Phenotyping is thereby an area of science which focuses on the complex


interaction between a specific genotype and the environment in which the plant develops, in order to monitor plant breeding.

Figure 4.7-1: Overview of selected issues and actions in agricultural applications of UAV-based sensors. Prepared by the author for this chapter.

In this chapter, basic principles and examples of corresponding applications will be shown. The chapter is divided into passively recording spectral sensors (thermal, RGB, multi- and hyperspectral approaches) focusing on biochemical plant properties, followed by a short review of deriving biophysical, structural plant data, such as plant height, and finally ending with further specific sensors.

4.7.1 Spectral data

Spectral sensors capture the partly reflected electromagnetic radiation of surfaces, ranging from the visible (~400 nm) through the near infrared up to the thermal infrared (~14 µm). The sensors record, as introduced in chapter 2.5, this radiation in bands covering wavelength ranges of different width: multispectral sensors use up to about ten bands with uneven ranges, whereas hyperspectral sensors use about 200 narrow, evenly spaced bands. Thus, a usual RGB camera is a multispectral sensor, capturing reflections in three bands located in the visible area of the electromagnetic spectrum (~400–700 nm). The reflection is stored as digital numbers (DN)


per band and needs further calculations or calibrations to retrieve the reflectance. Different surfaces show different reflectance patterns, and calculations on different bands, resulting in indices, allow the status of a surface to be estimated, e.g. its temperature or vitality. The received values of the before-mentioned multi- and hyperspectral measurements, the band values per pixel, can be used to calculate indices. The Normalized Difference Vegetation Index (NDVI) is the most commonly known index for plants, exploiting the fact that green plants reflect only weakly in the red area of the electromagnetic spectrum but strongly in the near-infrared area:

NDVI = (NIR − Red) / (NIR + Red)

where Red and NIR represent measured values from the specific wavelength areas of visible red and near-infrared reflection, resulting in values between -1 and 1. This simple index is widely used for the estimation of a plant's health status and its spatial distribution, and in general for vegetation detection, but it saturates after a certain stage of plant development. R² values greater than 0.5 in relation to biomass are typically found.
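To illustrate the calculation, the following minimal Python sketch computes the NDVI per pixel from two co-registered bands, assuming they are already available as NumPy arrays (e.g. exported from a processed, calibrated orthomosaic); the function name and the toy values are illustrative only.

```python
import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
    """Compute the NDVI per pixel from near-infrared and red bands.

    Both inputs are expected as float arrays of identical shape. Values fall
    in [-1, 1]; pixels where both bands are zero are returned as NaN.
    """
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    denom = nir + red
    with np.errstate(invalid="ignore", divide="ignore"):
        index = (nir - red) / denom
    return np.where(denom == 0, np.nan, index)

# Toy reflectance values: healthy vegetation (high NIR, low red) yields values
# close to 1, bare soil yields values near 0.
nir_band = np.array([[0.60, 0.55], [0.30, 0.05]])
red_band = np.array([[0.05, 0.08], [0.25, 0.04]])
print(ndvi(nir_band, red_band))
```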

4.7.2.1 Thermal

As pointed out in chapter 2.4, cameras acquiring thermal-infrared (TIR) information for UAV applications are uncooled instruments, which allow only a lower resolution. However, these instruments are widely used in agricultural applications, as they provide information on the plant water status. This information is important for irrigation monitoring, and irrigation will become even more important in the future due to climate change. Common indices that build on the derived temperatures (plant canopy and air temperature), such as the CWSI and WDI shown in chapter 2.4, are applied. Likewise, the thermal information helps to detect diseases and lodging as well (Liu et al., 2018). The complexity of these cameras and their required warm-up time are commonly noted drawbacks.

4.7.2.2 RGB imagery

An orthomosaic derived from RGB images already allows a general overview by visual inspection: colour and density differences as well as areas of lodging are easily detectable manually and also measurable, which is an important result for insurances.


In addition, the RGB information is usable to detect these kinds of differences in a crop stand by data-driven approaches. For instance, OBIA, an object-based image analysis that segments the image into homogeneous areas, is used, followed by a classification of the image segments in order to derive areas with invasive species (e.g., Alberto et al., 2020; Peña et al., 2013; Wijesingha et al., 2020). For the segmentation, information derived from the RGB images by transforming them to another colour model, here intensity, hue and saturation (IHS), can be used for successful classifications (Laliberte et al., 2010).

Figure 4.7-2: Orthomosaic and derived RGBVI (bottom) from a grassland trial site showing greener, healthier areas with a higher index value (green colours) than areas with low vitality or with bare earth (red colours) (details in Possoch et al., 2016).


Likewise, where integrable multispectral sensors are lacking or low-cost solutions are sought, specific digital RGB cameras that can be modified to acquire NIR light have been applied (called 'modified CIR'). This is done by removing the internal hot-mirror filter and placing a blue-blocking filter in front of the lens. With a radiometric calibration and extensive post-processing, the raw digital camera image can be converted into a red, green and NIR false-colour image, which can be used to provide normalized difference vegetation index (NDVI) images, delivering results similar to those obtained from multispectral cameras. Finally, an implementation in an effective crop health monitoring workflow, allowing a timely reaction, is possible. Hunt et al. (2010), for instance, found that modified CIR allows the green normalized difference vegetation index (GNDVI) to be calculated, and a regression model to the leaf area index (LAI) can be derived with an R² of 0.85; the LAI represents a structural plant parameter defined as the ratio of leaf area to a given unit of land area.

Further, machine learning approaches can be used on RGB imagery in order to estimate lodging (Zhang et al., 2020), and in addition, spectral indices are built on the three spectral bands (red, green, blue) of the reflected visible light. Examples of such indices are the Greenness Index (GI) (Gitelson et al., 1996) or the Triangular Greenness Index (TGI) (Hunt et al., 2011). An example of an RGB-based analysis is given in Figure 4.7-2, where a grassland experiment with differently fertilized plots is shown and the RGB Vegetation Index (RGBVI) was derived. The latter shows areas of more developed grass resulting from more fertilizer in contrast to areas with less fertilizer and correspondingly less developed plots of grass, as well as areas harmed by lodging or destruction by animals. All details are presented in Possoch et al. (2016). These indices can also be related to the distribution of chlorophyll content or can be reused, for instance, to estimate and monitor the vegetation cover or the gaps within it, calculated as the vegetation fraction (Torres-Sánchez et al., 2014).
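As an illustration of such RGB-only indices, the following sketch computes a per-pixel RGB Vegetation Index, assuming the commonly cited formulation RGBVI = (G² − R·B) / (G² + R·B); the exact definition should be checked against Possoch et al. (2016), and the function name and input arrays are purely illustrative.

```python
import numpy as np

def rgbvi(red: np.ndarray, green: np.ndarray, blue: np.ndarray) -> np.ndarray:
    """RGB Vegetation Index per pixel from the three visible bands.

    Assumes RGBVI = (G^2 - R*B) / (G^2 + R*B) with the bands as float arrays
    (digital numbers or reflectance); well-developed green vegetation yields
    higher values than soil or lodged areas.
    """
    g2 = green.astype(np.float64) ** 2
    rb = red.astype(np.float64) * blue.astype(np.float64)
    denom = g2 + rb
    with np.errstate(invalid="ignore", divide="ignore"):
        index = (g2 - rb) / denom
    return np.where(denom == 0, np.nan, index)

# Toy values: the first pixel is green vegetation, the second is bare soil.
red_band   = np.array([[0.10, 0.30]])
green_band = np.array([[0.35, 0.20]])
blue_band  = np.array([[0.08, 0.25]])
print(rgbvi(red_band, green_band, blue_band))
```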

4.7.2.3 Multi- and hyperspectral

Multi- and hyperspectral sensors measure the passively reflected amount of light of objects in specific bands, covering selected wavelength ranges of the electromagnetic spectrum. The information additional to the commonly used visible RGB-sensor information makes it possible to further distinguish objects and, in particular, to estimate a plant's health status. This is done by using data-driven approaches, band combinations, or relations set to corresponding measurements of plants, e.g. the biomass.

However, particularly for multi- and hyperspectral sensors, which measure digital numbers per pixel, calibrations are necessary for further analysis and comparisons. More details on these effects can be found in chapter 2.5. Besides a geometric calibration, a radiometric calibration is necessary. Mostly the sensors are pre-calibrated (e.g. concerning vignetting


effects, sensor contamination between bands and corrections for different exposure times, aperture sizes or ISO settings) and only a short calibration before and after each flight is necessary, in order to adjust to the actual environmental conditions. This is usually conducted by capturing images of a white reflectance panel with known reflectance values. In addition, several sensors are coupled with illumination sensors ('Downwelling Light Sensor (DLS)' or 'sunshine sensors') capturing the actual lighting conditions for each image during the flight, for correction in post-processing, which is most effective under completely overcast conditions. For more accurate results a smoothing of the DLS data might be necessary (Olsson et al., 2021). The DLS readings cannot correct shadowed parts of images and are less reliable under constant illumination conditions (e.g. sunny, clear days), as the error of the sensor is then higher. Another option for handling changing illumination conditions during flights are ground-based illumination sensors. To cope with atmospheric conditions, particularly at greater flight heights, a correction is also possible by integrating atmospheric modelling approaches, mostly known from satellite imagery analysis.

As a further procedure for radiometric correction, the empirical line method/calibration (ELM or ELC) can be used. ELM derives the coefficients needed to fit or adjust uncalibrated multispectral images (chapter 2.5). This calibration is conducted by placing several levelled, larger calibration panels in different shades from black to white at a central location within the flight path of the UAV platform. Spectral measurements for field calibration are taken on the calibration targets with a field spectrometer in the same spectral range (e.g. 350–1,050 nm) as the sensor and at nearly the same time as the image acquisition. The reference spectra are later used for the empirical line calibration method. For multi-temporal approaches, a calibration can be based on similar targets and procedures, or on artificial targets, such as roads, parking areas, traffic paintings, and buildings. In general, it is recommended to compare sensor values with ground-truth values; sensors need a warm-up time of several minutes.
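In essence, the empirical line calibration is a per-band linear fit between the panel digital numbers observed in the imagery and the panels' known reflectance. A minimal sketch, assuming mean panel DNs have already been extracted and using purely hypothetical panel values, might look as follows.

```python
import numpy as np

def empirical_line_coefficients(panel_dn: np.ndarray,
                                panel_reflectance: np.ndarray) -> tuple[float, float]:
    """Fit the empirical line for one band: reflectance = gain * DN + offset.

    panel_dn          -- mean digital numbers extracted over the calibration panels
    panel_reflectance -- known (field-spectrometer) reflectance of those panels
    """
    gain, offset = np.polyfit(panel_dn, panel_reflectance, deg=1)
    return float(gain), float(offset)

def apply_empirical_line(band_dn: np.ndarray, gain: float, offset: float) -> np.ndarray:
    """Convert a whole image band from digital numbers to reflectance."""
    return gain * band_dn.astype(np.float64) + offset

# Hypothetical example: three grey panels with known reflectance of 5 %, 25 % and 50 %.
panel_dn   = np.array([4200.0, 16800.0, 31500.0])
panel_refl = np.array([0.05, 0.25, 0.50])
gain, offset = empirical_line_coefficients(panel_dn, panel_refl)
print(apply_empirical_line(np.array([[12000.0, 28000.0]]), gain, offset))
```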

Besides the above-mentioned NDVI, a huge number of other indices exist, which are used in agricultural applications. Some of these enhance the NDVI (green normalized difference vegetation index, GNDVI; red-edge normalized difference vegetation index, RENDVI), adjust for bare-soil influences (soil-adjusted vegetation index, SAVI; optimized SAVI, OSAVI) or estimate chlorophyll content (green chlorophyll index, GCI). An overview of these indices is given at https://www.indexdatabase.de/. Likewise, the inclusion of information from the short-wave infrared wavelengths (SWIR) is possible (Jenal et al., 2020). The correction of the NDVI by a fraction cover enhances the estimation of leaf nitrogen content (Xu et al., 2021). Other possibilities to gain insights into plant health distribution are statistical and enhanced machine-learning-based approaches that relate the sensor values to plant parameters. Examples are multiple linear regression and stepwise multiple linear regression, the previously shown OBIA approach, as well as partial least squares regression and random forests.


The additional information from the non-visible reflection range allows classification results to be enhanced in any kind of application.

Results of index-based and other calculations are often related to the LAI. Several devices for ground-truth measurements, which are more time consuming, are available. The LAI is an indicator of yield and is useful to determine the correct amounts of pesticides or fungicides that are needed to protect a crop. LAI can also diagnose the nitrogen status for timely applications of fertilizers to boost yield. In addition, LAI is an important parameter for modelling mass and energy exchange between the biosphere and atmosphere, and is connected to photosynthesis, evaporation, rainfall interception, and carbon flux.

4.7.3 Structural data

4.7.3.1 Multi-temporal approach

Another approach to achieving biomass indications of a crop, instead of using relations of indices, is to use multi-temporal height calculations. Therefore, multiple derived digital crop surface models (CSMs) or the original point clouds of a crop stand are established. The CSMs, as DSMs of a crop, are built by using the dense 3D point cloud achieved from image matching algorithms (chapter 2.2), representing the top of the canopy. The 3D information can be stored as a 2.5D raster image. By building the difference between different points in time, the crop development (growth or decline), and the crop height by using a bare-earth height or base model, can be derived at a high spatial resolution (Hoffmeister, 2016; Hoffmeister et al., 2010). This is represented by the following equations:

CH_t = CSM_t − DTM

CD_t2−t1 = CSM_t2 − CSM_t1

where CH_t is the crop height at a time t, derived from the CSM_t minus the digital terrain model (DTM, or bare-earth model); the crop difference CD_t2−t1 is the difference between the CSMs of two certain points in time.
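These are simple per-cell raster operations once the CSMs and the DTM are co-registered. A minimal sketch, assuming the models are available as NumPy arrays on the same grid (the heights below are illustrative), could look like this:

```python
import numpy as np

def crop_height(csm: np.ndarray, dtm: np.ndarray) -> np.ndarray:
    """Crop height CH_t = CSM_t - DTM, per raster cell."""
    return csm.astype(np.float64) - dtm.astype(np.float64)

def crop_difference(csm_t2: np.ndarray, csm_t1: np.ndarray) -> np.ndarray:
    """Crop development CD_t2-t1 = CSM_t2 - CSM_t1 between two survey dates."""
    return csm_t2.astype(np.float64) - csm_t1.astype(np.float64)

# Toy example: 2 x 2 cells with surface heights in metres.
dtm    = np.array([[100.0, 100.1], [100.2, 100.1]])   # bare-earth model
csm_t1 = np.array([[100.3, 100.5], [100.4, 100.2]])   # early growth stage
csm_t2 = np.array([[100.9, 101.2], [100.8, 100.3]])   # later growth stage

print(crop_height(csm_t2, dtm))        # crop height at t2
print(crop_difference(csm_t2, csm_t1)) # growth between the two dates
# Per-plot statistics (e.g. the median height) are then related to manually
# measured heights or biomass samples by regression.
```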

The bare-earth height, or DTM, is either the result of a survey before crop emergence or an estimated value, e.g. obtained by interpolation of open areas or reconstructed from manual measurements. The derived crop height, or any statistic per area (e.g. median, minimum, maximum or percentiles), compared to manual measurements or biomass usually results in high regression fits (e.g., Gilliot et al., 2020), also allowing models for yield estimation to be established.


In addition, this biophysical information (plant height) can be combined with the previously mentioned indices in order to enhance estimations (Bendig et al., 2015; Lu et al., 2019).

4.7.3.2 LiDAR

Small LiDAR sensors (chapter 2.6) are also applied in agriculture in order to estimate biomass amount and distribution, as shown before. In contrast to the passive sensors, these systems actively send laser pulses and capture the reflected signal. As the signal travels at the group velocity, i.e. the velocity of light in the atmosphere, the distance can be accurately calculated ('time-of-flight' principle). With accurate angle determination and a connected INS, a 3D point cloud can be achieved directly, in contrast to the photogrammetric approach, which needs the recalculation of images. The intensity of the reflected signal allows objects to be distinguished, and the signal partly penetrates through vegetation, allowing ground or bare-earth points to be captured. The latter enables plant heights to be achieved without a multi-temporal approach and provides density information.

For example, Zhang et al. (2021) used Velodyne's HDL-32E UAV LiDAR and the Riegl VUX-1 UAV LiDAR system to study grassland and showed that these sensors are capable of effectively extracting vegetation parameters and deriving above-ground biomass. For this purpose, as is typical for laser scanning applications, the resulting raw point clouds are classified into different groups, e.g. ground, vegetation, trees, and noise. Afterwards, the ground points are used for the reconstruction of a bare-earth digital terrain model, and the digital surface model representing the crop height is, as shown before, used to build a difference. Likewise, the fractional canopy cover, the ratio of the number of vegetation returns to all returns for a given area, is used to represent the density of a crop surface. Both factors are then used in linear and nonlinear regression models to estimate biomass from (usually dried) samples. From the derived canopy height and fractional vegetation coverage (FVC) as the two predictors, above-ground biomass was estimated (R² = 0.54). It was also shown that different flight heights ranging from 40 to 110 m have only a minor influence on the results. Insights into wheat plant structure and development over time for an entire agricultural field, obtained by using multispectral indices in combination with LiDAR results, are presented by Bates et al. (2021).
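The fractional canopy cover described above is simply the ratio of vegetation returns to all returns within a grid cell. A minimal sketch, assuming the point cloud has already been classified and the class codes of the returns falling into one cell are at hand (class codes and counts below are invented), might be:

```python
import numpy as np

def fractional_canopy_cover(class_labels: np.ndarray, vegetation_class: int = 1) -> float:
    """Fractional canopy cover for one grid cell: vegetation returns / all returns.

    class_labels -- per-return class codes for the returns in the cell
    (here simply 0 = ground, 1 = vegetation; real workflows use more classes).
    """
    labels = np.asarray(class_labels)
    return float(np.count_nonzero(labels == vegetation_class) / labels.size)

# Hypothetical cell with 10 returns, 6 of which were classified as vegetation.
returns = np.array([1, 1, 0, 1, 0, 1, 1, 0, 1, 0])
print(fractional_canopy_cover(returns))  # 0.6
# Canopy height and FVC per cell are then used as predictors in a regression
# model against (usually dried) biomass samples.
```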

4.7.4 Further applications

Another larger part of research concerns spraying or sprinkling systems on UAVs, which allow pesticides or nutrients to be applied automatically at the right spot within a field.


For example, this spot-wise, effective method allows savings in fertilizer, which is particularly important for orchards and viticulture (Martinez-Guanter et al., 2020). Likewise, workflows are enhanced (e.g. by cloud computing of UAV-based imagery) for decision-making processes on a farm. The derived data can be used to automatically adjust the spraying amounts of a tractor. Instead of indirectly measuring reflectance and relating these values to biochemical or biophysical plant parameters, sensing of the solar-induced chlorophyll fluorescence is possible, which is closely related to the photosynthetic activity of a plant. However, due to the relatively low intensity of the signal, the comparison of upwelling radiance and downwelling irradiance is challenging (Bendig et al., 2020; Vargas et al., 2020). Active sensing using artificial light might overcome the illumination problems of passive remote sensing (Li et al., 2018). A combined approach of thermal, multispectral and LiDAR data also enables soil salinity to be determined accurately (Ivushkin et al., 2019), showing that a mixture of data might help to estimate agricultural parameters most accurately.

References for further reading


4.8 UAVs in conservation research

Serge A. Wich, Denise Spaan, Lilian Pintea, Russell Delahunty and Jeff Kerby

4.8.1 Traditional methods
4.8.2 Land-cover classification and change detection
4.8.3 Animal detection
4.8.4 Poaching
4.8.5 Discussion

Conservation needs to address animal distribution and density, habitat conservation, and poaching. Widespread and steep declines in biodiversity have resulted and will continue to result from anthropogenic pressures (Maxwell et al., 2016; Powers & Jetz, 2019). These include land conversion for agriculture, infrastructure and urban areas; killing animals for food or as a result of human-animal conflict; disease; pollution; and climate change (Benítez-López et al., 2019; Lanz et al., 2018; Spooner et al., 2018; Leendertz et al., 2017; Carvalho et al., 2019; Strona et al., 2018; Cooke et al., 2019; Johnston, 2019). To manage biodiversity and mitigate anthropogenic threats there is an urgent need for scale-appropriate information on the distribution and density of plants and animals, and on land-cover classification and change over time.

4.8.1 Traditional methods

Traditionally animal distribution and density data have been collected using an array of differ-ent survey methods, primarily through terrestrial (e.g. line transects), marine (e.g. with ships) or aerial (with occupied aircraft) approaches where researchers collect data on animal presence and numbers along transects, in plots, or from point samples (Buckland et al., 2001, 2004, 2010; Franklin, 2010). Data on species’ distribution are often used in combination with environmental


layers to model species' distributions (Franklin, 2010). To derive animal density from observations, researchers commonly use analyses based on the distance of the individual/group from the transect or point (Buckland et al., 2010). Such data on distributions and densities have also been combined into density-distribution models, which result in an overall abundance for species (e.g. Voigt et al., 2018; Santika et al., 2017).

In addition to observations from the ground and occupied aircraft, researchers have been using camera traps and autonomous acoustic recorders to obtain data on animal distribution and density (Campos‐Cerqueira & Aide, 2016; Wrege et al., 2017; Marques et al., 2013; Ahumada et al., 2020). These techniques have led to a wealth of data and knowledge, particularly on elusive species from which data are otherwise difficult to obtain through other survey means.

Despite the existence of these survey techniques, we still lack data on the distribution, and particularly the population density, of many species, contributing to them being classified as data deficient on the IUCN Red List (Jetz & Freckleton, 2015). As a result, within the primates we do not have total abundance estimates for most species, and for those we do, time series are lacking, which hampers our ability to determine trends in abundance (White, 2019). For instance, we only have species-level abundance estimates for a few of the great ape species, a set of species that are particularly well studied (Voigt et al., 2018; Wich et al., 2016a, 2019). This severe lack of density and distribution data is partially a consequence of the high costs associated with surveys and the lack of funding for such data collection (Jetz & Freckleton, 2015). There is therefore a strong need to develop new methods to obtain animal distribution and density data.

Simultaneously, there is a need to map and monitor the habitat that animals reside in. To support conservation efforts, it is often necessary to classify the various land-cover types that a species occurs in and monitor change in those over time. Often such land-cover classification and change detection are conducted on satellite images using a variety of bands of the electromagnetic spectrum (Horning et al., 2010). There are, however, several potential challenges with the use of satellites. First, in the humid tropics and the Arctic, persistent cloud cover is a challenge for obtaining cloud-free images at regular intervals (Hansen et al., 2008; Mulaca et al., 2011). Second, the interval at which the sensors on satellites collect data is often predetermined by the orbit of the satellite, and tasking satellites for data collection at other moments comes at a premium price. Third, even though the resolution of freely available images acquired by satellites is improving, it may not meet the requirements of specific conservation projects. Although higher-resolution satellite data may be available, the costs of such images are often prohibitively high for conservation projects.

Drone-mounted sensors may be able to provide data on both animals and their habitats, at lower costs and more scale-appropriate resolutions than alternative methods (e.g. very high-resolution images of the canopy of a tropical rainforest; Wich & Koh, 2018; Anderson & Gaston, 2013). In this chapter, we will review three aspects of conservation for which drones


are being used: 1) classification of land-cover types and changes therein for areas in which animals occur; 2) obtaining data on the distribution, behaviour, and density of animals; 3) anti-poaching efforts.

4.8.2 Land-cover classification and change detection

Land-cover classification and the detection of change in land cover over time are important aspects of conservation, and drones have been used for such studies in areas ranging from the Arctic to the tropics (Wich & Koh, 2018). Land-cover maps can provide a wealth of information for conservation decision-makers. For instance, land-cover maps can inform conservation managers about the land-cover types constituting the home range of an animal species of interest, in which areas they occur most often, and in which part of the home range they sleep most often. Subsequently, land-cover change detection can provide crucial information on which areas of an animal's home range experienced the most loss or conversion to another land-cover type. A variety of sensors have been mounted on drones to acquire images that can be used for land-cover classification and land-cover change detection (Wich & Koh, 2018; see Box 1). Because conservation researchers often need to map relatively large surface areas, the use of fixed-wing drones instead of multirotor drones for land-cover classification studies may be more effective (see Box 2). Most commonly, researchers use visual spectrum cameras (Red Green Blue (RGB)) to obtain images and then process these images using Structure-from-Motion (SfM) software to obtain orthomosaics (chapter 2.2). The resulting orthomosaics and digital surface models (DSMs) are then processed further to classify land-cover types or detect specific features such as a particular tree species (e.g. Wich et al., 2018; Reid et al., 2011; Laliberte et al., 2007; Cunliffe et al., 2016). Because traditional RGB cameras capture non-radiometrically calibrated spectral data, they can be less effective than radiometrically calibrated cameras in traditional spectral classification workflows. This particularly relates to land-cover classification and change detection as well as to the ability to calculate vegetation health indices (e.g. Normalized Difference Vegetation Index (NDVI), Green NDVI (GNDVI), etc.; Assmann et al., 2019; Michez et al., 2016). Despite this, there are examples of RGB images being used successfully to determine land-cover classes, leaf-area index, and vegetation (Wich et al., 2018; Liu & Wang, 2018; Silver et al., 2019). Hyperspectral cameras are increasingly being used, but their high cost and increased logistical and processing burdens currently limit their widespread adoption (Mitchell et al., 2012, 2016). Most studies use pixel- or object-based supervised or unsupervised classification methods that use reflectance for land-cover classification (Wich et al., 2018; Dunford et al., 2009; Laliberte & Rango, 2009; Fraser et al., 2016), but some take advantage of the point clouds that are generated during SfM to distinguish vegetation types by vegetation height (Cunliffe et al., 2016).


The accuracy with which land-cover classes are classified varies extensively for the different land-cover classes within and between studies. These differences are associated with variability in methods, scale of inquiry and measurement, as well as the available data (review in chapter 7 of Wich & Koh, 2018). Selecting the best analytical method and scale of observation for a particular set of data is therefore not straightforward (Levin, 1992), and, when feasible, carefully considering spectral and/or spatial grain requirements and testing several methods could offer advantages. The majority of studies have focused on one-off mapping of land cover, and land-cover change mapping with drones for conservation is still rare (Wich & Koh, 2018; see Box 2 for an example).

Drones have been used not only for mapping terrestrial areas but also in marine systems, where they have been used to monitor shoreline environments (Mancini et al., 2013), map coral reefs (Muslim et al., 2019; Etienne et al., 2015), and map seagrass coverage (Duffy et al., 2018). As for terrestrial monitoring, drones will augment data collection on the ground, from occupied aircraft and satellites, but will likely not completely replace any of these (Johnston, 2019).

The high spatial resolution data that can be obtained with drones can potentially also be important for Payment for Ecosystem Services (PES) mechanisms such as REDD (Reducing Emissions from Deforestation and forest Degradation), which have gained a large amount of interest in conservation (IPCC, 2007; Panfil & Harvey, 2016). An important component of such mechanisms is measuring above-ground carbon content, for which drones can be used (Jones et al., 2020; González-Jaramillo et al., 2019). Such carbon mapping is important not only for estimating the carbon content of forest landscapes but also in blue carbon ecosystems (mangroves, seagrasses, and salt marshes) (Jones et al., 2020; Pham et al., 2019).

Box 1: Using drones with a multispectral sensor to classify tree species in Tanzania

Near-infrared (NIR) radiation is widely considered to be an essential component in remotely sensed land cover assessments. NIR and its relationship to other wavelengths – Red in particular – has led to the development of some valuable classification indices, such as the Normalised Difference Vegetation Index, or NDVI (Tucker, 1979). The high sensitivity of all types of vegetation to NIR radiation makes it an extremely useful tool for large-scale land cover assessments (Townshend et al., 1991; DeFries et al., 1995) and for discriminating between various types of vegetation (Running et al., 1995; Schmidt & Skidmore, 2003). In this example, we investigate whether subtle variations


in NIR reflectance can be used as a predictor for identifying tree species in a miombo region (vegetation dominated by Brachystegia and Julbernardia species) of Western Tanzania. Tree health, height, size, prevalence, and leaf density are all potential factors that influence NIR reflection and combine to produce a spectral signature unique to each tree. If the spectral signature of individual trees is representative of that particular species as a whole, then there is potential to map large areas of forest and quickly discriminate tree species from above. This ability to perform rapid biodiversity assessments of the miombo woodland would have benefits for ecologists and conservationists alike.

Methodology. The data were collected at the Greater Mahale Ecosystem Research and Conservation (GMERC) camp in Western Tanzania (5°30'14.59"S, 30°33'44.49"E). The local landscape is characterised by miombo woodland, wet and dry grasslands, and well-established gallery forest in riparian valleys (Piel et al., 2015). Ground-truthing assessment took place over two research trips (2018 & 2019) and data on over 400 individual trees in areas of miombo woodland were recorded. These data comprised the tree species, DBH, a qualitative canopy description (to assist orthomosaic delineation) and the GPS coordinates for each tree, taken at the trunk. RGB data capture (included in Figure 4.8-1 as a visual reference and not used for analysis) was carried out using a DJI Mavic Pro in conjunction with the DJI GS Pro flight planning application. Multispectral data were obtained by flying over the ground-truthed areas at 120 m (400 ft) and using a high side-lap setting of 90 %. The aircraft was a custom-built DJI F550 hexacopter operating a PIXHawk flight controller, carrying a Parrot Sequoia multispectral sensor set up to capture data in four bands: Green (530–570 nm), Red (640–680 nm), Red Edge (730–740 nm) and Near Infrared (770–810 nm). A calibration panel of known reflectance values was photographed before each flight to allow images captured under changing illumination conditions to be spectrally corrected and standardised during processing. The recorded data were processed with Pix4D software to produce single-band grayscale reflectance maps that were subsequently combined to generate a 4-band false-colour orthomosaic using ArcMap (Figure 4.8-1). Tree GPS coordinates were overlaid onto these rasters and the laborious process of canopy delineation began. A total of 377 canopies across 14 different species were determined to be identifiable and selected for analysis. Quantitative pixel data were extracted from each raster band (Green, Red, Red Edge & NIR) for each identified tree, and the data were combined into basic statistics for the 14 different species (Min, Max, Mean, Median & Std Dev). These descriptive statistics were then analysed in R Studio using a range of classification techniques, such as Linear Discriminant Analysis, Support Vector Machines (SVM) and Random Forest, to find the highest accuracies and the best approach for tree identification.
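The classification step itself is a standard supervised-learning workflow: per-canopy band statistics as features, species labels as targets, and cross-validated accuracy as the score. The original analysis was run in R Studio; the sketch below shows an equivalent, purely illustrative workflow in Python with scikit-learn on synthetic stand-in data (feature counts, class counts and values are invented for demonstration).

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(42)

# Stand-in data: per-canopy descriptive statistics (e.g. mean and standard
# deviation of the Green, Red, Red Edge and NIR bands = 8 features) for
# 120 delineated canopies belonging to 3 species classes.
n_canopies, n_features, n_species = 120, 8, 3
X = rng.normal(size=(n_canopies, n_features))
y = rng.integers(0, n_species, size=n_canopies)

# A support vector machine on standardized features, evaluated by
# cross-validation; Random Forest or LDA could be substituted the same way.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
scores = cross_val_score(model, X, y, cv=5)
print("mean cross-validated accuracy:", scores.mean())
```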


Results. These data were analysed under two groupings, species and genus. The highest classification accuracy of 61 % was achieved for the dataset containing fourteen species using a variant of SVM. The same analysis also gave the best classification accuracy of 75 % when using the genus dataset, which further reduced the data to 7 classes. The more established techniques of SVMs and Discriminant Analysis generated the highest accuracies, whilst newer approaches such as Neural Networks tended to struggle. This trend was not readily apparent during earlier testing using larger, full-pixel datasets rather than condensed averages, suggesting that Neural Networks require a much larger dataset to generate better prediction accuracy.

Figure 4.8-1: Left panel – Traditional RGB orthomosaic of the study site, GMERC Camp, Western Tanzania. Middle panel – False-colour orthomosaic of the study site (showing the same area) displayed through NIR, Red, and Green channels. Right panel – False-colour orthomosaic of the study site, with six of the most numerous tree species displayed as delineated canopy polygons. BB – Brachystegia boehmii; BM – Brachystegia microphylla; BS – Brachystegia spiciformis; JG – Julbernardia globiflora; MA – Monotes africana; PC – Parinari curatellifolia. All figures were prepared by the authors for this chapter.

Box 2: Community monitoring of land-cover change in Tanzanian forest reserves

Forest ecosystems are threatened worldwide by human activities like conversion to agriculture, settlements, charcoal production, mining and logging. There is a need to monitor the status of and trends in forest cover, forest structure and threats at temporal and spatial scales meaningful to inform local decisions. Drones could be powerful tools to quickly collect, visualize


and share detailed information on forests and human activities at local scales and to monitor land cover and land use change as part of a participatory decision-making process. In this box we give a brief overview of how local communities and government decision makers could potentially use drones combined with participatory mapping approaches to monitor the enforcement and implementation of village land-use plans and community-owned protected areas.

The Greater Gombe Ecosystem (GGE) is an area of 640 km² located on the eastern shore of Lake Tanganyika in the Kigoma region of western Tanzania (Pintea et al., 2016). It includes Gombe National Park and adjacent community lands covering 27 villages. At Gombe, studies of wild chimpanzees (Pan troglodytes schweinfurthii) began in 1960 with the research of Dr. Jane Goodall (Wilson et al., 2020). Over the last few decades, there has been significant deforestation and environmental degradation outside the park. In addition to chimpanzee habitat loss and fragmentation, deforestation in the hilly terrain of the ecosystem has also resulted in unstable watersheds, threatening local settlements with more frequent and severe landslides and flash floods (Pintea et al., 2012).

In 1994 the Jane Goodall Institute (JGI) started the Lake Tanganyika Catchment Reforestation and Education (TACARE) program, designed to engage communities as key stakeholders in forest and chimpanzee conservation. Now known as Tacare, it represents JGI's community-centered conservation approach. Tacare is rooted in participation and inclusion. By directly engaging with local communities, a holistic approach develops an understanding of how people are connected to ecosystems, combining traditional knowledge with science and the appropriate use of innovative technologies, such as drones.

The Tacare process ensures that local communities own and drive the conservation effort on their lands. It includes facilitating local communities to secure land tenure and rights to natural resources according to the government land policies. In the case of Tanzania this involves a Participatory Village Land Use Planning process that facilitates local communities to resolve any land disagreements and agree on village boundaries and land uses to meet specific community needs, from access to clean water to farmland. This forms a foundation for other interventions targeting natural resource management, health, and sustainable livelihoods.

In GGE, participatory village land-use plans were prepared by the communities and facilitated and supported by JGI, including the use of high-resolution satellite imagery, GIS and other mapping tools.

By 2009, 13 villages had voluntarily assigned 9,690 ha, or 26 %, of their lands as Village Land Forest Reserves (Pintea, 2011). The location of these reserves was guided by a spatial vision developed as part of the GGE Conservation Action Plan (GGE-CAP, 2009), resulting in an interconnected network of village forest reserves that covered 68 % of the original historic chimpanzee habitat in GGE (Pintea, 2007).


Figure 4.8-2: Upper panel showing part of the orthomosaic from 2015, middle panel showing part of the orthomosaic from 2016, and lower panel showing the difference in the digital surface model between the two dates, draped over the shaded relief derived from the 2015 DSM.


JGI and partners are now engaged in building the capacity of village governments to implement their land-use plans, including restoring and managing their Village Forest Reserves. This includes supporting village forest monitors to use Survey123 (a mobile app to collect spatial data) and mobile technologies to patrol their forests on the ground and to interpret very high-resolution imagery from Maxar satellites and drones. This improves transparency and establishes a common language and understanding of how local communities enforce their village land-use plans.

Figure 4.8-2 shows an area of the Kalinzi village forest reserve in GGE acquired from a fixed-wing drone with an RGB camera in 2015 and 2016. Such imagery could be used by local communities to visually detect both increases in tree cover and new threats such as the conversion of trees to new farms or logging. Permanent and temporary houses can be seen, as well as the type of crops used for farming. Visual interpretation could be complemented by change detection algorithms using datasets derived from drone imagery. For example, Figure 4.8-2 shows normalized differences in surface heights detected from the 2015 and 2016 Digital Surface Models (DSM), estimated as (DSM2016 − DSM2015) / (DSM2016 + DSM2015). Note that this approach could quickly highlight areas of tree cover loss and gain that could be confirmed by visual interpretation (Pintea, 2016).
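The normalized DSM difference is a simple per-cell raster operation. A minimal sketch, assuming both DSMs are co-registered NumPy arrays (the heights below are illustrative), might look as follows.

```python
import numpy as np

def normalized_dsm_difference(dsm_new: np.ndarray, dsm_old: np.ndarray) -> np.ndarray:
    """Normalized surface-height change, e.g. (DSM2016 - DSM2015) / (DSM2016 + DSM2015).

    Positive values indicate a gain in surface height (tree growth, new buildings),
    negative values a loss (e.g. tree cover removal); both rasters must be co-registered.
    """
    new = dsm_new.astype(np.float64)
    old = dsm_old.astype(np.float64)
    denom = new + old
    with np.errstate(invalid="ignore", divide="ignore"):
        change = (new - old) / denom
    return np.where(denom == 0, np.nan, change)

# Toy example with surface heights in metres.
dsm_2015 = np.array([[612.0, 618.5], [604.0, 610.0]])
dsm_2016 = np.array([[615.5, 611.0], [604.1, 610.2]])
print(normalized_dsm_difference(dsm_2016, dsm_2015))
```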

4.8.3 Animal detection

A major part of the drone research in conservation has focused on detecting and locating animals (or their signs) in their environment and then using these data to inform distribution, density and abundance estimates (Wich & Koh, 2018; Chabot & Bird, 2015; Christie et al., 2016). Recent studies reveal that sensors on drones can be used to detect a large number of species across a broad range of terrestrial and aquatic habitats. In terrestrial habitats, examples range from large elephants in open savanna areas (Vermeulen et al., 2013) to small birds in fields (Israel & Reinhard, 2017). Studies in aquatic environments include marine mammals (Hodgson et al., 2013; Koski et al., 2015), sea turtles (Rees et al., 2018), sharks (Rieucau et al., 2018; Kiszka et al., 2016), fish in rivers (Groves et al., 2016; Harris et al., 2019) and marine conservation in general (Johnston, 2019). Drones have been used in a variety of environments from polar regions to the tropics and highland plateaus (Su et al., 2018; Duffy et al., 2017; Fiori et al., 2017). Although most work on animal detection has been done using drones equipped with visual spectrum cameras, there is a growing number of studies using thermal sensors mounted on drones with the aim of detecting animals when visual spectrum cameras are ineffective (e.g. low light, camouflage, partial cover under vegetation) (Scholten et al., 2019; Spaan et al., 2019; Burke et al., 2019; Rashman et al., 2018; Gonzalez et al., 2016; Kays et al., 2019; Seymour et al., 2017). While these studies


overcome limitations of visual spectral data, a challenge of using thermal data is the difficulty of distinguishing species when their sizes are similar (Burke et al., 2019b; Kays et al., 2019). Kays et al. (2019) suggest that combining flash photography or IR illumination for RGB images with thermal sensors might reduce such challenges.

Several studies have used drones to obtain data on the distribution and/or density of animals, and some have even used those data to obtain abundance estimates. A recent study of chinstrap penguins (Pygoscelis antarcticus) derived the total abundance of 14 colonies from data obtained with a fixed-wing drone (Pfeifer et al., 2019). Two other recent studies used fixed-wing drones to obtain data on several large wild and domestic herbivores on the Tibetan Plateau and in the Chang Tang National Nature Reserve in China and to estimate their abundance (Guo et al., 2018; Hu et al., 2018). Obtaining density estimates of animals with drones has also been done in marine settings, for example in a study on blacktip reef sharks (Carcharhinus melanopterus) where standard visual spectrum cameras were used to detect sharks in relatively shallow and clear waters (Rieucau et al., 2018).

An important aspect of the work with drones is comparing how similar drone counts are to those obtained with other methods (see Box 3). Studies have investigated this issue with visual spectrum and thermal sensors and for both animal and animal-sign counts (Spaan et al., 2019; Burke et al., 2019b; Wich et al., 2016b; Gooday et al., 2018), but the results vary. For instance, fewer orangutan nests were observed with drones equipped with a visual spectrum camera than on the ground, but the ground and aerial counts were correlated (Wich et al., 2016b). A carefully designed study with fake birds indicated that counts on visual spectrum drone images were more accurate than ground counts (Hodgson et al., 2018). Several studies with thermal imaging cameras indicated that counts from the thermal data were comparable with ground data or were higher for animals living high up in the forest canopy (Spaan et al., 2019; Burke et al., 2019b; Corcoran et al., 2019), but were lower in areas where canopy cover was high, for example in the case of New Zealand fur seals (Arctocephalus forsteri) (Gooday et al., 2018). Likewise, in one study, fewer primates were observed in a dense forest on the thermal images than during ground counts due to the high canopy cover (Kays et al., 2019).

Box 3: Using drones and thermal cameras to count spider monkeys

Spider monkey (Ateles spp.) populations are declining across their range (Mexico–Bolivia) due to deforestation and hunting, but determining population abundance by traditional methods is difficult as these species are arboreal, live in closed-canopy forests and have a high degree of fission-fusion dynamics (Ramos-Fernández & Wallace, 2008). For this reason, Spaan et al. (2019) assessed the effectiveness of using a drone fitted with a thermal infrared (TIR) camera to survey Geoffroy’s spider monkeys (A. geoffroyi) at their sleeping sites in Los Arboles Tulum (20°17’50”N, 87°30’59”W), Mexico. They compared the number of spider monkeys counted by observers on the ground (ground counts) to the number of spider monkeys counted from TIR drone footage (drone counts) using a concordance analysis.

Ground and drone counts were compared for a total of 28 drone flights at three spider monkey sleeping sites. Between sunset and sunrise the difference between the ambient temperature and spider monkey skin temperature is greatest, making this the ideal time to fly the drone with the TIR camera. However, due to restrictions in national regulations, flights were performed around sunset and sunrise. The authors performed a combination of grid and hover flights at 60–70 m above ground level. Grid flights consisted of the drone flying in a grid pattern over the sleeping site, and during hover flights the drone hovered above a single sleeping tree for several minutes. A group of observers counted the number of monkeys in a subgroup from the ground while the drone was flying over the same area. The number of monkeys observed in the TIR footage collected by the drone was determined post-flight. Using hand-drawn maps, they determined the visual field of the ground observers and only compared the number of monkeys observed in the drone TIR footage to the number of monkeys counted on the ground within that area.

The researchers flew a custom-made quadcopter fitted with a TeAx Fusion Zoom dual-vision TIR/RGB camera. The camera was fitted to a gimbal to keep the footage steady during flights. Grid and hover flights were planned in the Mission Planner software (v1.3.52.0; http://ardupilot.org/planner/).
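
Although the original mission files are not reproduced here, the logic of such a grid flight can be illustrated with a short sketch. The Python snippet below is a minimal, hypothetical example of generating lawnmower-style grid waypoints over a rectangular area at a fixed altitude; the coordinates, line count and helper name are illustrative assumptions, not the study's actual flight plan.

```python
# Minimal sketch (assumed helper, illustrative values): lawnmower-style grid
# waypoints over a rectangular survey area at a fixed altitude, similar in
# spirit to what ground-station software such as Mission Planner produces.
def grid_waypoints(lat_min, lat_max, lon_min, lon_max, n_lines=6, alt_m=65.0):
    """Return (lat, lon, altitude) waypoints for back-and-forth grid lines."""
    waypoints = []
    for i in range(n_lines):
        lat = lat_min + (lat_max - lat_min) * i / (n_lines - 1)
        line = [(lat, lon_min, alt_m), (lat, lon_max, alt_m)]
        if i % 2 == 1:          # reverse every other line for a lawnmower path
            line.reverse()
        waypoints.extend(line)
    return waypoints

# Illustrative extent only (not the surveyed sleeping sites):
for wp in grid_waypoints(20.2960, 20.2985, -87.5170, -87.5140):
    print(wp)
```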

The authors used Lin’s concordance coefficient to test agreement between the methods (Lin, 1989), where agreement is measured from -1 (no agreement) to 1 (perfect agreement) (McBride, 2005). As individual monkeys are more likely to be missed during ground surveys when subgroups are larger (Defler & Pintor, 1985; Chapman et al., 2015), the authors predicted that the two survey methods would show no agreement for large subgroups, i.e. the drone would count more monkeys than ground observers. In contrast, when spider monkey subgroups were small, they predicted a high level of agreement between the methods, i.e. the drone and ground counts would not differ. To test these predictions, they compared drone and ground counts for small subgroups (≤9 individuals) and large subgroups (≥10 individuals).
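
For readers unfamiliar with the statistic, Lin’s concordance correlation coefficient can be computed directly from paired counts as twice the covariance divided by the sum of the two variances and the squared mean difference. The Python sketch below is a minimal illustration of that standard definition; the paired counts in it are invented and are not the study’s data.

```python
# Minimal sketch: Lin's concordance correlation coefficient for paired
# drone and ground counts (illustrative numbers, not the study's data).
import numpy as np

def lins_ccc(x, y):
    """Lin's concordance correlation coefficient between paired samples x and y."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    sx2, sy2 = x.var(), y.var()          # population variances (ddof=0)
    sxy = ((x - mx) * (y - my)).mean()   # population covariance
    return 2 * sxy / (sx2 + sy2 + (mx - my) ** 2)

drone_counts  = [3, 5, 7, 4, 6]          # made-up small-subgroup counts
ground_counts = [3, 5, 6, 4, 6]
print(f"rc = {lins_ccc(drone_counts, ground_counts):.2f}")
```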

When spider monkey subgroups were small, the two methods agreed (rc = 0.90 [95 % CI: 0.79–0.95]), indicating that the drone performed as well as observers on the ground. However, when spider monkey subgroups were large (i.e. included ten or more individuals), drone counts were higher than ground counts, i.e. there was no agreement between the methods (rc = 0.08 [95 % CI: -0.38–0.40], Figure 4.8-3).


As expected, drone counts were higher than ground counts for large subgroups. They attributed this difference to the difficulty of observing primates from the ground as they tend to blend in with their surroundings, whereas the monkeys appeared as clearly distinguishable white objects on the TIR footage. Additionally, TIR footage from drones can be replayed multiple times after flights have been completed, aiding detection of the animal of interest. The authors recommended the use of drones fitted with thermal cameras for surveying arboreal primates as they can cover larger areas and count monkeys equally well or better than ground observers.

To overcome the short flight limitations of quadcopters, future avenues will explore the use of fixed-wing drones fitted with thermal cameras to cover larger areas in single flights. Research should focus on flying over sleeping sites at night, to ensure that the entire group can be counted in a single flight or a series of back-to-back flights and thereby obtain (near) complete counts of spider monkey groups.

Figure 4.8-3: Bar chart showing the differences in the number of monkeys counted from the TIR drone footage and by observers on the ground for small and large subgroups.

Researchers have been using drones fitted with visual spectrum cameras for a variety of other interesting research and/or conservation questions that go beyond detecting animals. Assessing the health of animals is an important topic for conservation, and innovative work shows that visual spectrum images acquired with drones can be used to obtain body-size measurements of Australian fur seals (Arctocephalus pusillus doriferus) as indices of their body condition (Allan et al., 2019). In other work, researchers have used images obtained from drones to study the social interactions in barren-ground caribou (Rangifer tarandus groenlandicus) (Torney et al., 2018) and blacktip reef sharks (Carcharhinus melanopterus) (Rieucau et al., 2018). These studies are indicative of the emergence of new ways to analyse drone data to answer ecological questions.

Researchers have also ventured beyond imaging sensors and are using an array of different sensors for studies that are important for conservation. A very promising, but vastly understudied, topic is the use of acoustic sensors mounted on drones to detect species through their calls. A consistent challenge for such studies is how to avoid the noise from the drone influencing the recordings of the animals’ calls. One option is to increase the distance from the drone to the microphone by attaching the microphone (with or without the recorder) to a cable or rope below the drone. This approach was used in a study on a number of bird species; compared to ground counts it produced only slightly lower species richness estimates and a comparable number of birds per point count, although for species with low-frequency songs the drone estimates were lower (Wilson et al., 2017). Another study used a Styrofoam baffle to reduce the noise from the drone and successfully recorded the echolocation calls of the Brazilian free-tailed bat (Tadarida brasiliensis) (Kloepper & Kinniry, 2018; Fu et al., 2018).

An important aspect of many conservation projects is to locate animals and subsequently determine their home range (Thomas et al., 2012). A multitude of methods is being used for this, including hand-held radio receivers to locate VHF tags on animals, and GPS tags which record locations at predetermined intervals and upload them to phone networks or satellites (Thomas et al., 2012). These methods have their challenges, such as the difficulty of locating VHF tags over large and often inaccessible areas or, in the case of GPS tags, the cost of obtaining data through satellites and tags that are too large for the animal of interest. As a result, several studies have investigated using drones to locate VHF tags on animals, but most of these are still in experimental phases (Muller et al., 2019; Nguyen et al., 2019; Desrochers et al., 2018; Cliff et al., 2015). A recent study on yellow-eyed penguins (Megadyptes antipodes) was able to use a drone to locate VHF-tagged penguins faster and with a lower search effort than other methods (ground-based VHF tracking and manual ground searching) (Muller et al., 2019). Although this technology will take some time to mature and become widely available, the results are promising and could facilitate conservation efforts tremendously.

Core challenges preventing drones from becoming an efficient tool for conservation are the burdens of data curation and post-processing, activities that rely on skills not traditionally prioritized in conservation science curricula. At the moment, a large number of the analyses performed on data captured by drones are conducted by humans who count the animals manually. This means that the efficiency of data collection gained by using drones is potentially offset by the costs and time needed to manually count animals or other objects of interest. This challenge is not unique to the usage of drones in conservation but applies to other methods of collecting data as well (e.g. camera traps) (Weinstein, 2018). There are two important aspects to counting animals: detection and, in the case of multiple species, classification (Wich & Koh, 2018). Several studies have applied computer vision methods to automate the detection of animals or their signs (e.g. nests), with almost all studies focusing on a single species (Wich & Koh, 2018; Weinstein, 2018; Kellenberger et al., 2018a). Particularly promising are machine learning methods that have been applied to both thermal and visual spectrum images (Corcoran et al., 2019; Kellenberger et al., 2018b). In some cases, non-machine-learning methods that use thresholding and classification have been successful as well (Vayssade et al., 2019). Despite the promise of automated detection and classification of multiple species, more research is needed on how characteristics such as the sensor used, habitat type, size of the animal, number of species in the area, and colour of the animal affect detection and classification accuracies. At the same time, there is a need to make machine learning methods more accessible to non-computer scientists, to increase access to large training datasets (potentially in collaboration with citizen science projects), and to develop methods that deal with small and unbalanced datasets (Weinstein, 2018; Kellenberger et al., 2018b). Collaborations between computer scientists, ecologists and large companies that can provide the required computing power are increasingly becoming important to achieve these goals. Universities could facilitate this as well by incorporating machine learning into the curriculum for conservation science students.
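
As a simple illustration of the non-machine-learning route mentioned above, a thermal frame can be thresholded and segmented into connected warm regions whose centroids serve as candidate detections. The sketch below is a minimal, hypothetical example: the threshold, minimum blob size and synthetic frame are assumptions for illustration, not values taken from any of the cited studies.

```python
# Minimal sketch: threshold-based detection of warm objects in a thermal frame
# (a 2-D array of temperature values). Parameter values are illustrative only.
import numpy as np
from scipy import ndimage

def detect_warm_blobs(thermal_frame, temp_threshold=30.0, min_pixels=20):
    """Return centroids (row, col) of connected warm regions above the threshold."""
    mask = thermal_frame > temp_threshold          # pixels warmer than background
    labels, n = ndimage.label(mask)                # connected-component labelling
    centroids = []
    for i in range(1, n + 1):
        blob = labels == i
        if blob.sum() >= min_pixels:               # discard small noise blobs
            centroids.append(ndimage.center_of_mass(blob))
    return centroids

# Synthetic frame with one warm 'animal' on a cooler background:
frame = np.full((100, 100), 22.0)
frame[40:48, 60:68] = 34.0
print(detect_warm_blobs(frame))
```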

It is important to discuss the potential disturbance that drones can cause animals and how to minimize potential sources of disturbance (Mulero-Pázmány et al., 2017; Hodgson & Koh, 2016). There have been several studies on disturbance to animals caused by drones (reviews in Wich & Koh, 2018; Mulero-Pázmány et al., 2017) and this research area continues to grow (e.g. Brunton et al., 2019; Bennitt et al., 2019). These studies indicate that disturbance ranges from being absent (at least in terms of an observable change in the behaviour of an animal) to leading to strong behavioural reactions by animals, such as flying away and alarm calling (see Table 6.2 in Wich & Koh, 2018). The review by Mulero-Pázmány et al. (2017) shows that reactions depend on aspects related to the animals themselves (species, life history, breeding or non-breeding, and aggregation level), characteristics of the drone (multirotor, fixed wing, powering system) and the flight pattern (grid flight or flight that specifically approaches the animal). Specifically, flights with larger drones and those that are powered by fuel instead of batteries lead to stronger reactions by wildlife. In addition, birds are generally more likely to be disturbed by drones than other taxa (Mulero-Pázmány et al., 2017). As a result of these studies, researchers have started to develop guidelines to minimize disturbance (Wich & Koh, 2018; Hodgson & Koh, 2016). It is important to compare the potential disturbance of drones to that of alternative survey methods (Wich & Koh, 2018). A pair of observers walking a trail in a rainforest to count non-habituated primates will often also lead to disturbance (Schaik et al., 1983), a manned aircraft flying over penguins can lead to pronounced disturbance even at large distances (Wilson et al., 1991), and even camera traps have been found to have some influence on animals (Meek et al., 2014, 2016). There is therefore a strong need to conduct comparative studies in which several survey methods are evaluated in terms of the disturbance they cause to animals (e.g. Scholten et al., 2019). These studies should, ideally, go beyond the visually observable behaviour of animals and incorporate physiological measures of stress as well. To date, almost no research has been conducted on this, except for the measurement of physiological responses to drones in bears (Ditmer et al., 2015).

4.8.4 Poaching

Poaching is a core issue for animal conservation and generally considered one of the two main threats facing wildlife (the other being land-cover change) (Benítez-López et al., 2017, 2019; Fa & Brown, 2009; Wich & Marshall, 2016). Given the limited resources available to protected area managers for deploying anti-poaching missions and the risks involved with such missions (Olivares-Mendez et al., 2013), there is an interest in determining whether drones could support anti-poaching missions (review in Wich & Koh, 2018). The most important aspect of such missions is to detect poachers before they reach the target animal(s), and several organizations have been deploying drones for such efforts (e.g. WWF, Air Shepherd). It is difficult to determine the success of these efforts as, understandably, few details of such operations are provided by the organizations involved. It is known, however, that poachers have been detected and that on at least one occasion this has led to poachers being intercepted. Despite this, little has been published on how often poachers might have been missed during operations and which factors are important for detection. An experimental study in which poachers were mimicked by students and research staff in Tanzania showed that a thermal sensor led to a higher detection probability than a visual spectrum sensor at dawn and dusk, but that a higher canopy density and a larger distance from the line of flight led to decreased detection probability on thermal images (Hambrecht et al., 2019). The study also found that image analysts differed in their detection probability and suggested that machine learning might solve that issue. As with the detection of animals, a future avenue is to automate the detection of poachers through machine learning. Some promising work on automating detection using machine learning is being conducted and, once fully operational in the field, should facilitate the use of drones in anti-poaching missions (Bondi et al., 2018; Fang et al., 2019). For anti-poaching missions, near-real-time detection must be achieved. At least three options are interesting to explore: first, detection on board the drone, with an alert sent to a ranger indicating location and object in case of a detection; second, detection on live images at a remote server through a GSM network (Bondi et al., 2018); third, detection on live images on a local computer in the field.
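
To make the first option more concrete, the following Python sketch outlines an onboard loop that runs a detector on incoming frames and forwards confident detections as alerts. Every name in it (detect_people, send_alert, the coordinates) is hypothetical; an operational system would plug in a trained model and an actual GSM, LoRa or radio link.

```python
# Minimal sketch of onboard detection with an alert to a ranger.
# All helpers and values are placeholders, not an operational system.
import time
from dataclasses import dataclass

@dataclass
class Detection:
    lat: float
    lon: float
    confidence: float

def detect_people(frame):
    """Placeholder detector; a real deployment would run a trained model here."""
    return [Detection(-2.3301, 34.8320, 0.91)] if frame.get("warm_object") else []

def send_alert(det):
    """Placeholder transmission; in the field this would use GSM, LoRa or radio."""
    print(f"ALERT: person at {det.lat:.4f}, {det.lon:.4f} (conf {det.confidence:.2f})")

def onboard_loop(frames, min_confidence=0.8):
    for frame in frames:                       # one pass per received frame
        for det in detect_people(frame):
            if det.confidence >= min_confidence:
                send_alert(det)                # only forward confident detections
        time.sleep(0.1)

onboard_loop([{"warm_object": False}, {"warm_object": True}])
```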

While drones might have many benefits, it is important to evaluate the social impacts that the use of drones for anti-poaching or other conservation efforts might have (Wich & Koh, 2018; Sandbrook, 2015; Humle et al., 2014; Wich et al., 2016c; Nowlin et al., 2019). The use of drones raises a set of questions surrounding data security, privacy, safety, negative implications for local communities, and so forth that require more discussion (Sandbrook, 2015; Nowlin et al., 2019). It is relevant to mention that these do not necessarily only apply to drones but might also apply to other remote sensors being used for conservation efforts, such as camera traps, satellites, and acoustic sensors (Wich et al., 2016c). Technology such as drones can, however, also be used by local communities to map their lands as part of efforts to resist dispossession of those lands (Millner, 2020; Radjawali et al., 2017). Although such use is not yet widespread, studies conducted in Guatemala and Indonesia show that local communities can use drones to map their lands and use such maps to counter government land-use plans (Millner, 2020; Radjawali et al., 2017).

4.8.5 Discussion

The past decade or so has seen tremendous progress in the use of drones for conservation research (Wich & Koh, 2018). This progress has been facilitated by the growth and affordability of the consumer drone market, the development of drones for industries such as agriculture and mining, and the large open-source community for software (e.g. Mission Planner) and hardware (e.g. Pixhawk) that allowed for bespoke drone development for specific purposes. As a result, drones are now used in three important aspects of conservation: land-cover classification and change detection, animal counts, and assessing human behaviour (e.g. poacher detection). For the latter, social and privacy implications need to be considered carefully (Sandbrook, 2015; Humle et al., 2014; Nowlin et al., 2019). Even though there has been a proliferation of drone use for conservation, much of it is still in a research phase and, as far as we are aware, drones rarely form part of the standard operations of day-to-day conservation area management. In the case of anti-poaching work by organizations such as Air Shepherd, there are sustained operations in a small number of areas, but such operations are not yet widespread, nor are drones often used operationally for mapping and animal counting in conservation area management. This might partially be due to the complexity of using drones and analysing the data, as well as of integrating the results with real-time decision making in the case of poaching. Thus, the usage of drones for conservation is still in its infancy, and it will likely take several years before this technology has matured sufficiently for sustained operational usage in a large number of conservation management and research settings.

There are likely several reasons why the uptake of the technology at a large scale has not happened yet. First, despite the relative ease with which data can be collected in certain circumstances, the analysis of the data, particularly for animal counts, is still largely manual, thereby increasing the costs and potentially preventing drones from being more cost efficient than other data collection methods. There is thus a strong need for (semi-)automated methods. Second, in many countries the regulations prevent flying beyond the visual line of sight or make it quite complicated. Because this distance is ~500 m from the pilot, it restricts the use of drones to quite small areas. Whilst regulations are crucial for the safe operation of drones, it is worth evaluating a risk-oriented approach in which such distances can be increased in areas of low risk to other air users, people, and property. These low-risk areas are often where conservationists would like to use drones: remote national parks, marine reserves, and so forth. Third, even though many drone systems, in particular multirotor systems, have become very user-friendly, there is still a hurdle for the adoption of the technology by those who have had less opportunity to use technology (Paneque-Gálvez et al., 2014), and even with consumer-grade drones accidents can happen due to insufficient experience (Semel et al., 2019). This hampers uptake, and bespoke, sometimes costly, training is required to overcome it in the short term (Radjawali et al., 2017; Paneque-Gálvez et al., 2014, 2017). Fourth, drone usage often relies on access to the internet for firmware updates, ground control software, downloading base maps, and potentially uploading data to cloud servers for analyses. In many areas where conservationists would like to operate drones there is no mobile network, nor offices with satellite internet, which can make operations more cumbersome (Paneque-Gálvez et al., 2014, 2017). Fifth, drone repair opportunities are often limited. Repairs to consumer systems are often difficult due to proprietary components and a lack of spare parts. In contrast, bespoke systems can often be repaired in the field given that spare parts and a drone engineer are available; however, this is rarely the case (Paneque-Gálvez et al., 2014). Sixth, the durability of drones is limited and often not well quantified (Paneque-Gálvez et al., 2014). It is, for instance, in most cases not known after how many hours of use a motor should be repaired, because durability tests on most parts of a drone either have not been conducted or have not been made available by companies. The challenges outlined above are by no means meant to be an exhaustive list, but just some that we have encountered in our work and have found in the literature. In addition to these, the location one aims to operate in can pose additional challenges, such as cold temperatures reducing flight time when using LiPo batteries, or sand and corrosion wearing drone parts when flying in coastal or marine settings (Duffy et al., 2017).

Even though drones offer rich data acquisition opportunities on their own, combining drones with other technologies can lead to the integration of various types of data (e.g. acoustic, visual, vibration) and/or of similar data collected using multiple sensors (e.g. visual spectrum images from a camera on a drone and from camera traps on the ground) (Wich & Koh, 2018). Drones could be used for both data acquisition and data transmission in this type of sensor network. Exciting steps in this direction are being taken by initiatives such as Smart Parks, which has been operating in several national parks in Africa, tracking animals through long-range (LoRa) networks. Given the variety of sensors available to detect poaching events or poachers, for instance, there is a wealth of opportunity to link sensors together to achieve a better anti-poaching system (Kamminga et al., 2018). Such integration of sensors is also becoming more common in ecological research and, combined with the rapid developments in machine learning, will likely lead to very exciting new research and conservation approaches during the coming decade (Allan et al., 2018).

References for further reading


Abbreviations


References


All reasonable attempts have been made to trace and contact the copyright holders of all images. However, this might not have been possible in each and every case. You are invited to contact us if your image was used without identification or acknowledgment. A compensation will be in accordance with standard market terms.