Making Australia’s unique soil and marine microbiome data more accessible for all

This month, work started on the Australian Environmental Microbiome Research Data Cloud project, which will link significant national biological datasets with national biosciences computational infrastructure and services funded through the National Collaborative Research Infrastructure Strategy.

These datasets map the microbial communities of Australia's terrestrial and marine environments. They are the result of more than $10m of effort invested through the Biomes of Australian Soil Environments (BASE) consortium and the Marine Microbes (MM) consortium, respectively.

Understanding the combined genetic material of microorganisms in their environment is of broad national and international utility for academic researchers, industry and government agencies, but at the moment the data is accessible only to skilled bioinformaticians, which limits its potential.

Experts from the Centre for Comparative Genomics (CCG) at Murdoch University, Queensland Cyber Infrastructure Foundation (QCIF), Melbourne Bioinformatics at the University of Melbourne, the Atlas of Living Australia, CSIRO and the EMBL Australia Bioinformatics Resource (EMBL-ABR) will deploy a new cloud-based analysis system to make the data easier to find, analyse and interpret.

With biological science fast becoming a data science, and data growing at an exponential rate, Australian bioscience capability efforts, like those the world over, are increasingly focussed on making nationally unique datasets findable, accessible, interoperable and reusable (i.e. curated according to the FAIR principles).

Microbes are fundamental to human and ecosystem health. They mediate biogeochemical and nutrient cycling, are integral to the production of crops, livestock and textiles, are responsible for disease and its treatment, and provide clean drinking water and the means to mitigate waste and pollution.

The BASE and MM consortia have generated the raw data and built protocols and informatics pipelines to interrogate it.

Through this project it will be easier for biologists, not just skilled bioinformaticians, to interrogate the data using existing technology platforms like the Australian-made Genomics Virtual Laboratory. And via the BPA Data Repository, experts will ensure that, throughout any research project lifecycle, the data is continually updated in permanent international metagenomics repositories.

Bioinformaticians take many years to develop their expertise. Even with easier-to-access data, training needs to be a focus of the project, so new training materials to accompany the data analysis system will be made freely available for re-use (to researchers from universities, industry or government) via the EMBL-ABR and ECOEd training portals.

A number of research community representatives will oversee the project, which builds upon past research infrastructure work by experts at ANDS/RDS/Nectar, BPA, CCG, Melbourne Bioinformatics, the University of Melbourne, QCIF, QFAB, Intersect and the Research Computing Centre at the University of Queensland.

The project is due for completion on 31 December 2018.


Project updates will be posted on the project blog at:

Project Coordinator at host institution: Jeff Christiansen, Queensland Cyber Infrastructure Foundation Ltd (QCIF), Axon Building 47, University of Queensland, St Lucia, Brisbane QLD

More Information

Helen van de Pol

+61 448 920 235