The selected AGINFRA+ Use Cases will illustrate the benefits of applying the Science as a Service approach to pressing research questions from the corresponding research communities. The pilot trials will showcase the evolved AGINFRA components and services in a range of experiments, specifically demonstrating their generality and performance.
Use Case #1: Agro-climatic & Economic Modelling
Responsible Partner: ALTERRA
This use case concerns a global agricultural modeling, intercomparison, and improvement research community that studies short- and long-term food production under environmental and climate change conditions. The problem addressed is accessing, combining, processing and storing high-volume, heterogeneous data on agriculture and food production projections under different climate change scenarios, so that a diverse research community of agricultural, climate and economic scientists can assess food security, food safety and climate change impacts in an integrated manner.
The mission of this research community is to improve historical analysis and short- and long-term forecasts of agricultural production and its effects on food production and the economy under dynamic, multi-variable climate change conditions, by aggregating extremely large and heterogeneous observations and dynamic streams of agricultural, economic, ecophysiological, and weather data.
Bringing together researchers who work on these problems from various perspectives (crop production and farm management methods, climate change monitoring, economic production models, food safety models) and accelerating user-driven innovation is a major challenge. The new AGINFRA services will enable executable workflows for ecophysiological model intercomparisons driven by historical climate conditions using site-specific data on soils, management, socioeconomic drivers, and crop responses to climate. These intercomparisons are the basis for the future climate impact and adaptation scenarios: instead of relying on single model outputs, model-based uncertainties will be quantified using multi-model ensembles. The close interactions and linkages between disciplinary models and scenarios, including climate, ecophysiology and socio-economics, will allow researchers to prioritize necessary model improvements across the model spectrum.
Multi-model, multi-crop and multi-location simulation ensembles will be linked to multi-climate scenarios to perform consistent simulations of future climate change effects on local, regional, national, and global food production, food security, and poverty.
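The idea of quantifying model-based uncertainty with a multi-model ensemble can be illustrated with a minimal sketch. The model names, scenario names and yield values below are hypothetical placeholders, not AGINFRA+ outputs; the sketch only shows how an ensemble mean and spread per climate scenario might be derived instead of relying on a single model output.

```python
from statistics import mean, stdev

# Hypothetical yields (t/ha) simulated by three crop models for one site
# under two climate scenarios. All names and numbers are illustrative only.
simulated_yield = {
    "scenario_low": {"model_a": 6.0, "model_b": 5.0, "model_c": 7.0},
    "scenario_high": {"model_a": 4.0, "model_b": 2.0, "model_c": 6.0},
}


def ensemble_summary(yields_by_model):
    """Return ensemble mean, sample standard deviation and range across models."""
    values = list(yields_by_model.values())
    return {
        "mean": mean(values),
        "spread": stdev(values),  # between-model spread as an uncertainty proxy
        "min": min(values),
        "max": max(values),
    }


# One summary per climate scenario.
summaries = {
    scenario: ensemble_summary(models)
    for scenario, models in simulated_yield.items()
}

for scenario, summary in summaries.items():
    print(scenario, summary)
```

In this toy example the between-model spread is larger under the second scenario, which is the kind of signal that could help prioritize model improvements.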
The data sources required for this work are developed by different community members, processed using different systems, and shared among the community members. This creates several challenges connected to multiple factors: different platforms, diverse data management activities, distributed data processing and storage, heterogeneous data exchange, and distributed model runs, scenario analysis, and visualization activities. Thus, AGINFRA+ will also develop a reactive, data-intensive analysis layer over existing federations that will help the discovery, reuse and exploitation of heterogeneous data sources created in isolation, in very different and unforeseen ways, in the rest of the communities’ systems.
Use Case #2: Food Safety Risk Assessment
Responsible Partner: BfR
In the context of the Food Safety Risk Assessment pilot, the AGINFRA+ project will assess the usefulness of AGINFRA components and APIs to support data-intensive applications powered by the FoodRisk-Labs suite of software tools. This includes the extension of FoodRisk-Labs’ capabilities to handle large-scale datasets, to visualize complex data, mathematical models as well as simulation results and to deploy generated data processing workflows as web-based services.
More specifically, FoodRisk-Labs will be extended so that it can access and use AGINFRA Services. This will allow the design of new high-performance food safety data processing workflows that facilitate, for example, the efficient extraction of data and models from the rich corpus of scientific literature. Another workflow will address the generation of easy-to-maintain open food safety model repositories (openFSMR), which exploit AGINFRA Ontological Engineering and Open Science Publishing Components. Mathematical models published in community-driven openFSMR will then be used for large-scale quantitative microbial risk assessment (QMRA) simulations. These simulations incorporate predictive microbial models, models of food processing and transportation, dose-response models and consumer behaviour models. AGINFRA components that support the execution of computationally intensive simulations, as well as those that help present simulation results, will be applied here. Finally, preconfigured QMRA models will be deployed as easy-to-use web services using specialised AGINFRA components.
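To make the structure of such a QMRA simulation concrete, the following minimal Monte Carlo sketch chains a simple growth model, a transport step, a consumer behaviour step and an exponential dose-response model. It is not FoodRisk-Labs code and does not use any AGINFRA API; every distribution and parameter value is a hypothetical placeholder chosen for illustration only.

```python
import math
import random

# All parameter values below are hypothetical placeholders, not published estimates.
GROWTH_RATE_LOG10_PER_HOUR = 0.05  # assumed log-linear growth during transport
DOSE_RESPONSE_R = 1e-6             # assumed exponential dose-response parameter


def simulate_serving(rng):
    """Simulate one serving and return its probability of causing illness."""
    initial_log10_cfu_per_g = rng.gauss(1.0, 0.5)  # contamination at production
    transport_hours = rng.uniform(2.0, 24.0)       # processing/transport step
    serving_size_g = rng.uniform(50.0, 150.0)      # consumer behaviour step

    # Predictive microbial model: log-linear growth over the transport time.
    final_log10 = initial_log10_cfu_per_g + GROWTH_RATE_LOG10_PER_HOUR * transport_hours
    dose_cfu = (10.0 ** final_log10) * serving_size_g

    # Exponential dose-response model: P(ill) = 1 - exp(-r * dose).
    return 1.0 - math.exp(-DOSE_RESPONSE_R * dose_cfu)


def mean_risk(n_servings, seed=0):
    """Average the per-serving risk over n_servings simulated servings."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    return sum(simulate_serving(rng) for _ in range(n_servings)) / n_servings


print(f"mean risk per serving: {mean_risk(10_000):.2e}")
```

Large-scale QMRA runs repeat this kind of loop over many more iterations and scenarios, which is where computationally intensive execution support becomes relevant.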
Use Case #3: Food Security
Responsible Partner: INRA
The new AGINFRA infrastructure will be used to alleviate the big data challenges pertaining to specific problems on food security, namely the need to efficiently analyse data produced by plant phenotyping and its correlation with crop yield, resource usage, local climates etc. It will particularly explore how high throughput phenotyping can be supported in order to:
- Determine adaptation and tolerance to climate change. A high priority is to design high-yielding varieties adapted to contrasting environmental conditions, including those related to climate change and new agricultural management. This requires identifying allelic variants with favourable traits amongst the thousands of varieties and natural accessions existing in genebanks. Genotyping (i.e. densely characterizing the genome of breeding lines with markers) has been industrialized and can now be performed at affordable cost, so being able to link and analyse phenotyping and genotyping data is strategic for agriculture.
- Optimize use of natural resources. High throughput plant phenotyping aims to study plant growth and development from the dynamic interactions between the genetic background and the environment in which plants develop (soil, water, climate, etc.). These interactions determine plant performance and productivity, and knowledge of them can be used to optimize and preserve natural resources.
- Maximize crop performance. Gathering and analysing data from high throughput plant phenotyping allows a better knowledge of plants and their behaviour in specific resource conditions such as soil conditions and new climates.
These tasks generally require intensive big data analysis, as they present all the challenges associated with big data (Volume, Velocity, Variety, Validity).
Each high throughput plant phenotyping experiment produces several terabytes of very heterogeneous data (sensor monitoring, images, spectra). Data are produced at high frequencies, and hundreds of thousands of images can be gathered and analysed each day. Such volumes require automatic data validation tools. One of the major challenges of plant phenotyping is semantic interoperability.
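As a rough illustration of what automatic data validation could look like for sensor records, the sketch below checks required fields and plausibility ranges. The field names, variable names and limits are assumptions made for this example; they are not taken from any phenotyping platform or AGINFRA component.

```python
# Hypothetical plausibility limits per sensor variable. Real limits would come
# from the phenotyping platform's own metadata.
PLAUSIBLE_RANGES = {
    "air_temperature_c": (-10.0, 60.0),
    "soil_water_content": (0.0, 1.0),
}

REQUIRED_FIELDS = ("plant_id", "timestamp", "variable", "value")


def validate_record(record):
    """Return a list of problems found in one sensor record (empty if valid)."""
    problems = [f"missing {field}" for field in REQUIRED_FIELDS
                if record.get(field) is None]

    variable = record.get("variable")
    value = record.get("value")
    if variable is None:
        return problems

    if variable not in PLAUSIBLE_RANGES:
        problems.append(f"unknown variable {variable}")
    elif isinstance(value, (int, float)):
        low, high = PLAUSIBLE_RANGES[variable]
        if not low <= value <= high:
            problems.append(f"{variable}={value} outside [{low}, {high}]")
    return problems


# Illustrative records: one valid, one out of range, one with a missing identifier.
records = [
    {"plant_id": "P001", "timestamp": "2017-06-01T10:00:00",
     "variable": "air_temperature_c", "value": 24.5},
    {"plant_id": "P002", "timestamp": "2017-06-01T10:00:00",
     "variable": "soil_water_content", "value": 1.7},
    {"plant_id": None, "timestamp": "2017-06-01T10:00:00",
     "variable": "air_temperature_c", "value": 22.0},
]

for record in records:
    print(record["plant_id"], validate_record(record))
```

Checks of this kind are cheap per record, so they could in principle run on each incoming record before storage or further analysis.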
This AGINFRA+ pilot will assess the effectiveness of the proposed framework in data intensive experimental sessions, where the distinct processing steps operate over different datasets, require synchronization of results from various partial computations, and use very large and/or streaming raw or processed data.