7 - Automation

Automating repetitive data tasks can reduce errors.

Can you automate any repetitive tasks?

Computer tasks that a person has to repeat over and over are opportunities for errors to creep in. Setting up an automated way of running repetitive (and boring) tasks can reduce this risk. Anything from an MS Excel formula or macro to code written in a data science framework can help.

First steps

Let’s think about the repetitive tasks that you could automate - do you always rename files the same way? Do you manually copy files across?
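As a minimal sketch of what automating those two tasks could look like, the Python below copies every file from one folder to another and gives each copy a consistent name. The function names, the naming rule (prefix, lower case, underscores instead of spaces), and the folder names in the usage note are assumptions made for illustration; adapt them to your own conventions.

```python
from pathlib import Path
import shutil


def standardise_name(path: Path, prefix: str) -> str:
    """Build a consistent file name: prefix, lower-case stem, underscores for spaces."""
    stem = path.stem.strip().lower().replace(" ", "_")
    return f"{prefix}_{stem}{path.suffix.lower()}"


def copy_with_standard_names(source_dir: Path, target_dir: Path, prefix: str) -> list[Path]:
    """Copy every file in source_dir to target_dir under a standardised name.

    The original files are left untouched.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source_dir.iterdir()):
        if path.is_file():
            destination = target_dir / standardise_name(path, prefix)
            shutil.copy2(path, destination)  # copy2 also tries to preserve timestamps
            copied.append(destination)
    return copied
```

For example, `copy_with_standard_names(Path("raw"), Path("clean"), "projx")` would copy a file called `Interview 01.TXT` to `clean/projx_interview_01.txt`. Working on copies rather than renaming in place means a mistake in the naming rule never damages the original files.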

Ways you can automate things:

Intermediate

From survey tools to programming languages, there are several ways to automate beyond macros and spreadsheet formulas. If you rely on a survey tool to conduct qualitative research, try to select one with a lot of functionality. For example, some survey tools allow the survey creator/administrator to request demographic responses, which then determine how a survey participant is guided through the survey, with specific questionnaires for each predefined demographic.
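The branching idea can be illustrated in a few lines of Python. This is only a sketch of the logic: survey tools configure branching through their own interfaces, and the roles and questionnaire names below are hypothetical.

```python
# Hypothetical mapping from a demographic answer to the questionnaire shown next.
QUESTIONNAIRE_BY_ROLE = {
    "student": "student_experience",
    "academic": "teaching_and_research",
    "professional": "workplace_services",
}
DEFAULT_QUESTIONNAIRE = "general_feedback"


def next_questionnaire(role: str) -> str:
    """Return the questionnaire for a role, falling back to a general one."""
    return QUESTIONNAIRE_BY_ROLE.get(role.strip().lower(), DEFAULT_QUESTIONNAIRE)
```

Here `next_questionnaire("Student")` selects the student questionnaire, while an unlisted answer falls back to the general one, so no participant is left without a path through the survey.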

Alternatively, could you code up your work so it's completely automated?

  • Learn to code in Python or R - Talk to your local hacky hour or Software Carpentry people.

Developing code requires an initial outlay of effort, but you will reap rewards in future time saved.

  • Data Cleaning
    • Code can be used to automatically process data, impute corrections, or format data into long/wide structures.
  • Data Analysis and Visualisation
    • For a qualitative researcher, sentiment analysis algorithms can be coded in Python to increase the efficiency and volume of text analysed. The results can then be visualised using graphics packages from Python or R.
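The data cleaning steps mentioned above can be sketched with nothing but the Python standard library. The data, the column names, and the choice of mean imputation are all assumptions made for illustration; in practice a data library such as pandas is commonly used for the same operations on larger datasets, and the right way to handle missing values depends on your study.

```python
from statistics import mean

# Made-up wide-format data: one row per participant, one column per survey wave.
wide_rows = [
    {"participant": "P1", "wave_1": 4.0, "wave_2": 5.0},
    {"participant": "P2", "wave_1": None, "wave_2": 3.0},
    {"participant": "P3", "wave_1": 2.0, "wave_2": None},
]
value_columns = ["wave_1", "wave_2"]


def impute_with_column_mean(rows, columns):
    """Return copies of rows with missing values replaced by the column mean."""
    means = {
        col: mean(row[col] for row in rows if row[col] is not None)
        for col in columns
    }
    return [
        {**row, **{col: means[col] if row[col] is None else row[col] for col in columns}}
        for row in rows
    ]


def wide_to_long(rows, id_column, columns):
    """Reshape to long format: one row per (participant, variable) pair."""
    return [
        {id_column: row[id_column], "variable": col, "value": row[col]}
        for row in rows
        for col in columns
    ]


cleaned = impute_with_column_mean(wide_rows, value_columns)
long_rows = wide_to_long(cleaned, "participant", value_columns)
```

Because the same functions run on every new export, the cleaning rules are applied identically each time and are documented in the code itself.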

Create and Capture

Collecting qualitative data is commonplace across many research projects. As such, having a robust automated survey solution can improve research workflows and the quality of data being collected.

Griffith University maintains two official survey tools:

  • REDCap (Research Electronic Data Capture)
  • LimeSurvey

Each solution has different strengths and its own use cases. To learn more about these tools, visit this Griffith site: https://www.griffith.edu.au/library/research-publishing/working-with-data/create-and-capture

There are also two repositories for self-paced learning:

  • REDCap - https://griffithunilibrary.github.io/redcap-intro/
  • LimeSurvey - https://griffithunilibrary.github.io/limesurvey/

Advanced

In 2022, the National Academies of Sciences, Engineering, and Medicine published a report titled Automated Research Workflows for Accelerated Discovery: Closing the Knowledge Discovery Loop.

The National Academies Press defines Automated Research Workflows (ARWs) as follows (NAS, 2022, p. 22):

  • ARWs integrate computation, laboratory automation, and tools from artificial intelligence in the performance of tasks that make up the research process such as designing experiments, observations, and simulations; collecting and analysing data; and learning from the results to inform further experiments, observations, and simulations.

Artificial intelligence and machine learning (AI/ML) are increasingly used as fundamental drivers of ARW capabilities. The sub-disciplines of AI/ML can support different stages of ARW design, such as text analysis for literature reviews; quantitative data science for analysis, computation, and visualisation; and learning algorithms for future experimental design.
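To make the "learn from the results to inform further experiments" loop concrete, here is a deliberately simple sketch in Python. It is not taken from the report: the experiment is a made-up function whose response peaks at a setting of 3.0, and the "learning" is just narrowing the search range around the best result so far. Real ARWs would use actual instruments or simulations and far more capable learning algorithms.

```python
def simulated_experiment(setting: float) -> float:
    """Stand-in for a real experiment or simulation; the response peaks at 3.0."""
    return -(setting - 3.0) ** 2


def closed_loop_search(low: float, high: float, rounds: int = 5, points: int = 5) -> float:
    """Repeatedly design, run, and learn, narrowing the range around the best result."""
    best = low
    for _ in range(rounds):
        step = (high - low) / (points - 1)
        # Design: choose evenly spaced settings across the current range.
        candidates = [low + i * step for i in range(points)]
        # Collect: run the (simulated) experiment for each setting.
        results = {c: simulated_experiment(c) for c in candidates}
        # Learn: keep the setting with the best response.
        best = max(results, key=results.get)
        # Inform the next round: search more finely around the best setting.
        low, high = best - step, best + step
    return best
```

Calling `closed_loop_search(0.0, 10.0)` moves closer to the peak with each round without any human choosing the next settings, which is the essence of closing the loop.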

Workflow projects and programs support distributed computing, scalable data processing, or querying large and distributed datasets. Interactive notebook environments such as Jupyter or RStudio support human-computer interaction, because the individual steps within a workflow (chunks) can be analysed for performance, tweaked, and improved.
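The idea of analysing individual steps can be sketched as a tiny pipeline runner that times each step. The step functions and the sample temperature readings are invented for illustration; a notebook gives you similar feedback interactively, one chunk at a time.

```python
import time


def run_pipeline(data, steps):
    """Run the steps in order, recording how long each one takes (in seconds)."""
    timings = {}
    for step in steps:
        start = time.perf_counter()
        data = step(data)
        timings[step.__name__] = time.perf_counter() - start
    return data, timings


def drop_missing(values):
    """Remove missing readings."""
    return [v for v in values if v is not None]


def to_celsius(values):
    """Convert Fahrenheit readings to Celsius, rounded to one decimal place."""
    return [round((v - 32) * 5 / 9, 1) for v in values]


def summarise(values):
    """Report how many readings remain and their mean."""
    return {"n": len(values), "mean": round(sum(values) / len(values), 1)}


result, timings = run_pipeline(
    [50.0, None, 68.0, 86.0],
    [drop_missing, to_celsius, summarise],
)
```

Because every step is a separate function, a slow or faulty step shows up in `timings` or in its own output, and can be tweaked or replaced without touching the rest of the workflow.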

An example using climate science:

Distributed computing and its scalability have allowed high-resolution climate models to be simulated. ML is used to analyse micro-scale weather interactions both in simulated environments and in the real world. Data from space, ground, and ocean sensors can be integrated through various pipelines into one location to be used as a single source for computation, simulation, and analysis.

Please see the image below for a graphic representation of this ARW concept in practice.

Climate science ARW
The 'learning and observing' feedback loops associated with this Automated Research Workflow.

If you identify the need for an Automated Research Workflow in your project, contact eResearch Services at Griffith University to start the conversation at: https://www.griffith.edu.au/eresearch-services