This year, I presented a session titled "Can Engineers Save the Planet?" at Codemotion'23 and JOTB'23. Since it received a lot of interest, I'm summarizing it in a blog post for any engineer out there who is interested in saving the planet too.
I've also written two previous blog entries about the experience of preparing a session for such important events. If you are interested in that, check my previous entry titled Codemotion & JOnTheBeach: Can Engineers Save The Planet Talk - Part 1.
Next, I also explained how much fun it was to deliver the talk at those big events. Don’t hesitate to read about my experience in part 2 of the blog series and also part 3.
You should probably start by asking yourself why you are reading this. It is probably because you, as an engineer, developer, manager, tester, or individual, care about the planet and want to learn how to save it, or at least how to harm it less.
Perhaps you became an IT professional to create a better world, not only to create video games or make money. That is exactly my case. If you are in the same shoes, please keep reading.
There is no planet B. I know Elon Musk has his own idea about how to solve this problem: just move to Mars when the Earth is not enough. But I have some concerns about it.
Imagine for one second that we do have a Planet B. There probably won’t be enough resources to save all the people and animals on the planet.
But imagine we have the resources to save all the people and the 8.7 million species of plants and animals in existence. How can we recreate the planet EXACTLY as it is today? Or even better, a totally clean new version, with no pollution?
The answer is that we can’t. Our blue planet is not a Docker image that we can run in a container, so that if the container gets corrupted, we simply deploy the image in a new one.
There are some problems we need to solve as soon as possible, like climate change, global warming, pollution, deforestation, and many others that are, sadly, familiar to us.
I’d like to show you some numbers to help you think about how big our problem is. And the problem can get bigger if we don’t take action right now.
2020 was the hottest year since records began, with an average increase of 1.7 degrees worldwide, and we’re expecting to reach 2 degrees by 2040.
In 2021, the global sea level set a new record high of 97 mm above 1993 levels.
Also in 2021, the total frequency of natural disasters was 13% higher, especially for flood disasters. And every day, up to 150 species may go extinct.
There are so many causes, but I’d like to talk about our footprints because we’re leaving footprints throughout our lives. Every single action we take in our lives has a consequence for the environment. So we need to optimize how we use and consume resources.
But we're living in a digital world, so it is not only about the carbon footprint. We’re also leaving digital footprints, since in the digital world everything is recorded.
Now I’d like to talk about inefficiencies, because our traditional approaches are full of them. Remember, we live in a digital world where our data is being collected, but sometimes nobody does anything with it, or it is not used to let us know what we’re doing wrong. We are creating inefficiencies in that digital world as well.
It is like recycling our garbage, but focusing on data instead. We need to start recycling the data we’re collecting. But how do we do that when we don’t even know how to recycle physical garbage?
So if we know how to use the data properly, with more efficiency, that usage of the data will lead us to needing fewer resources, and it will harm the planet less.
It is easy to be conscious of carbon footprints, because some of them are visible, like garbage.
But what about the digital footprint? It is not so easy, since we can’t see or touch it. But it already exists, and in the digital world, digital footprints are everywhere.
But what exactly is the digital footprint? Let's define it, because it is the core of the solution proposed in this article.
The digital footprint is just a collection of data based on digital events. We can call it the Internet of Events, with the following classification:
Content: Websites like Wikipedia, or browsers, generate tons of data based on our searches.
People: We are connected to each other through social networks, and we are creating events there.
Things: Manufacturers are connecting more devices to the web every day, like TVs, and those devices are generating events as well.
Places: Just think about our mobile devices. They are event-generating machines.
Check the table above to see how the data is being collected. This is a small, entirely invented example from the textile and fashion industry, one of the most polluting industries in the world.
And because we’re buyers, they have processes to manufacture the clothes we want to buy, no matter if we really need them or not.
So we can say that every single row here is part of our digital footprint. Because we’re buyers, someone needs to manufacture the clothes we buy, so indirectly this is part of our digital footprint. It belongs to us.
Here we can see the events, and also the time at which each one was fired.
But this is a very small example. Imagine a huge amount of data coming from different sources with different structures, like a relational database, a non-relational one, CSV files, or just logs.
How can we process tons of digital footprints in a way that helps us become more efficient?
Process Mining can be a potential solution. Just by knowing how to use our data, we could reveal where processes are not 100% efficient and optimize them to reduce their emissions.
Do you remember what I explained about recycling garbage? Process Mining is like having a personal assistant who tells you whether you are doing it well or badly, and who even proposes solutions in real time.
But how can it be done? Let me start with a brief introduction to Process Mining.
Process Mining can be explained as the bridge between process model analysis and data model analysis. It has a strong connection with Data Mining, but I like to think of Process Mining as the link between those two concepts: processes and data.
And why? Because Data Mining is not process-centric and does not focus on events. On the other hand, in Process Mining we assume that events refer to instances and activities. And events are data as well.
Remember the previous example about the event log. Now notice how a Case ID appears. This is important because the activities are steps in a process, and the case ID is the link between those different activities.
Then what is an event in Process Mining? An event is the combination of the case + activity + timestamp. We don’t really care about other columns, like for example the notes. And remember, the event is our digital footprint.
And that is the missing link! That combination of case ID, activity, and timestamp is an identifier for our digital footprints. We can say that is our formula. And we can use it to discover, diagnose, and fix our processes to become more efficient.
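To make that formula concrete, here is a minimal sketch in Python. The event log, the case IDs, and the activity names are all invented for illustration (in the spirit of the textile example), and build_traces is a hypothetical helper, not part of any Process Mining tool. It keeps only case ID, activity, and timestamp, and uses them to rebuild the ordered steps of each case.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log rows: (case_id, activity, timestamp, notes).
# All values are invented for illustration.
EVENT_LOG = [
    ("order-1", "Order received", "2023-05-01T09:00:00", "online shop"),
    ("order-2", "Order received", "2023-05-01T09:05:00", ""),
    ("order-1", "Fabric cut", "2023-05-01T10:30:00", ""),
    ("order-1", "Garment sewn", "2023-05-01T13:00:00", "rework needed"),
    ("order-2", "Fabric cut", "2023-05-01T11:00:00", ""),
    ("order-1", "Shipped", "2023-05-02T08:00:00", ""),
]


def build_traces(event_log):
    """Group events by case ID and order each case's activities by timestamp."""
    events_by_case = defaultdict(list)
    for case_id, activity, timestamp, _notes in event_log:
        # Only case + activity + timestamp matter; the notes column is ignored.
        events_by_case[case_id].append((datetime.fromisoformat(timestamp), activity))
    return {
        case_id: [activity for _, activity in sorted(events)]
        for case_id, events in events_by_case.items()
    }


traces = build_traces(EVENT_LOG)
for case_id, activities in traces.items():
    print(case_id, "->", " > ".join(activities))
```

The case ID is what ties scattered rows together: without it, the six rows above would just be six unrelated timestamps.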
That is all fine: we know we need to collect the data, and we even know how to collect it. But now the question is “How can we use that data?”
Petri nets and reachability graphs are foundational for process modeling. They are used to describe and check the behavior recorded in our event logs.
A Petri net consists of places, transitions, and the arcs that connect them. Tokens sitting in the places define its marking, which is its current state. From that, we can build a reachability graph, whose nodes are the markings that can be reached, and from that, we can express behaviors.
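As a rough illustration, the sketch below builds the reachability graph of a tiny Petri net. The net itself (place names p1 to p4 and the four transitions) is invented, and I'm assuming a "safe" net where a place holds at most one token, so a marking can be stored as the set of marked places. Real nets may need token counts instead.

```python
from collections import deque

# Hypothetical Petri net: transition name -> (input places, output places).
# Assumption: the net is safe (at most one token per place), so a marking
# is represented as a frozenset of the places that currently hold a token.
TRANSITIONS = {
    "receive_order": ({"start"}, {"p1", "p2"}),
    "cut_fabric": ({"p1"}, {"p3"}),
    "print_label": ({"p2"}, {"p4"}),
    "ship": ({"p3", "p4"}, {"end"}),
}
INITIAL_MARKING = frozenset({"start"})


def reachability_graph(initial, transitions):
    """Breadth-first exploration of every marking reachable from `initial`."""
    markings = {initial}
    edges = []
    queue = deque([initial])
    while queue:
        marking = queue.popleft()
        for name, (inputs, outputs) in transitions.items():
            if inputs <= marking:  # enabled: every input place has a token
                following = frozenset((marking - inputs) | outputs)
                edges.append((marking, name, following))
                if following not in markings:
                    markings.add(following)
                    queue.append(following)
    return markings, edges


markings, edges = reachability_graph(INITIAL_MARKING, TRANSITIONS)
for source, name, target in edges:
    print(sorted(source), f"--{name}-->", sorted(target))
```

Because cut_fabric and print_label can happen in either order, the graph shows both interleavings. That is the kind of behavior a reachability graph makes explicit.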
The idea is to build the sequence of activities for every single case and, from those sequences, create a Petri net from scratch that covers all our cases (maybe millions). This can easily be built from the event logs. This technique is what we call Play-In.
Play-In is the way to create a process model based on events, and by collecting all the reachability graphs from our Petri net, we can achieve the goal of creating a process model.
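Here is a deliberately simplified sketch of the Play-In idea. Instead of discovering a full Petri net, it only counts which activity directly follows which one, producing what is usually called a directly-follows graph. Real discovery algorithms go further than this. The traces and the function name are invented for illustration.

```python
from collections import Counter

# Hypothetical traces (case ID -> ordered activities), invented for illustration.
TRACES = {
    "order-1": ["Order received", "Fabric cut", "Garment sewn", "Shipped"],
    "order-2": ["Order received", "Fabric cut", "Garment sewn", "Shipped"],
    "order-3": ["Order received", "Fabric cut", "Garment sewn", "Garment sewn", "Shipped"],
}


def discover_directly_follows(traces):
    """Count how often each activity is directly followed by another one."""
    follows = Counter()
    for activities in traces.values():
        for current, following in zip(activities, activities[1:]):
            follows[(current, following)] += 1
    return follows


dfg = discover_directly_follows(TRACES)
for (current, following), count in sorted(dfg.items()):
    print(f"{current} -> {following}: {count}")
```

Even this tiny model already reveals something: the "Garment sewn -> Garment sewn" edge appears once, which hints at rework in one of the cases.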
But what if you have already created a Petri net and want to recreate the event log? It is possible: by playing out the model, the event log can be recreated. This is Play-Out.
The basic idea is that we start from a model, and from that model we can generate behaviors. So, of course, we can recreate our digital footprint, and even drive the model to create corner cases.
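A minimal sketch of Play-Out could look like the following. I'm assuming a very simple model here, a map from each activity to the activities allowed next, not a real Petri net. The activity names, START/END markers, and function names are all hypothetical. A seeded random walk over the model produces a synthetic event log.

```python
import random

# Hypothetical process model: activity -> activities allowed next.
# "START" and "END" are artificial markers, not real activities.
MODEL = {
    "START": ["Order received"],
    "Order received": ["Fabric cut"],
    "Fabric cut": ["Garment sewn"],
    "Garment sewn": ["Garment sewn", "Shipped"],  # the loop models rework
    "Shipped": ["END"],
}


def play_out(model, rng, max_steps=20):
    """Random walk through the model; returns one generated trace."""
    trace = []
    current = "START"
    for _ in range(max_steps):  # the cap guards against endless rework loops
        current = rng.choice(model[current])
        if current == "END":
            break
        trace.append(current)
    return trace


def generate_event_log(model, cases, seed=42):
    """Generate `cases` traces; the seed makes the run repeatable."""
    rng = random.Random(seed)
    return {f"case-{i}": play_out(model, rng) for i in range(1, cases + 1)}


generated = generate_event_log(MODEL, cases=5)
for case_id, activities in generated.items():
    print(case_id, "->", " > ".join(activities))
```

Raising the chance of the rework loop, or adding rare branches to the model, is one way to generate the corner cases mentioned above.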
That brings us to conformance checking: the ability to generate diagnostics based on how well reality fits (or does not fit) the model.
The final technique is to improve a model based on event data, using that data to repair the model.
How do we do that? It is as easy as comparing the event log (remember, our digital footprint) to the process model and checking whether there is any deviation or any degradation in performance.
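The comparison can be sketched as follows. This is a simplified stand-in: it only checks whether each step in a trace is an allowed "directly follows" pair in the model, whereas real tools use more elaborate replay techniques. The model, the traces, and the function names are invented for illustration.

```python
# Hypothetical model: the set of allowed (activity, next activity) pairs.
ALLOWED = {
    ("START", "Order received"),
    ("Order received", "Fabric cut"),
    ("Fabric cut", "Garment sewn"),
    ("Garment sewn", "Shipped"),
    ("Shipped", "END"),
}

# Hypothetical observed traces, invented for illustration.
OBSERVED = {
    "case-1": ["Order received", "Fabric cut", "Garment sewn", "Shipped"],
    "case-2": ["Order received", "Garment sewn", "Shipped"],  # skipped a step
    "case-3": ["Order received", "Fabric cut", "Garment sewn", "Garment sewn", "Shipped"],
    "case-4": ["Order received", "Fabric cut", "Garment sewn"],  # never shipped
}


def find_deviations(trace, allowed):
    """Return every step in the trace that the model does not allow."""
    steps = ["START", *trace, "END"]
    return [pair for pair in zip(steps, steps[1:]) if pair not in allowed]


def check_conformance(traces, allowed):
    """Per-case deviations plus the share of cases that fit the model."""
    report = {case: find_deviations(trace, allowed) for case, trace in traces.items()}
    fitting = sum(1 for deviations in report.values() if not deviations)
    return report, fitting / len(traces)


report, fitness = check_conformance(OBSERVED, ALLOWED)
for case_id, deviations in report.items():
    print(case_id, "OK" if not deviations else f"deviations: {deviations}")
print(f"fitting cases: {fitness:.0%}")
```

The deviations are the interesting part. Each one is either a problem in the process (rework, skipped steps) or a sign that the model is incomplete and needs repairing.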
The idea is to replay the model, but with the purpose of repairing, improving, or extending it, which creates a new, improved version of the model.
In the previous diagram, we can see the three techniques mentioned in this article.
We’re discovering process models based on data. From that data, we’re creating a visualization of the process (play in), then applying diagnostics (play out), and finally repairing the process by looking for deviations that enrich it. That is enhancement.
I’d like to show how Process Mining can help us with a very small example: what we could do if we could track all our daily activities and later put that data into Celonis’ Process Mining tool.
Let’s suppose this is what your life (probably) looks like. You wake up early in the morning, you have your breakfast, you go to work, you eat something, you have some relaxing time, and, finally, you sleep. It all looks like a perfect sequence of activities, with no deviations and no distractions.
In that scenario, do you wake up every morning at the same time, like a robot? Do you take the same means of transport every day, and is it always on time?
The reality is that this sequence of activities only shows the happy path that follows the perfect order. But does it represent your reality?
Life is complex and complicated, and that is true for any routine. In fact, “routine” as a term should not exist. As a father of two small kids, I’m chasing the perfect routine to get my kiddos to bed reasonably early, and every day I face a lot of deviations from my happy path. The most typical one is “Daddy, I don’t wanna go to bed”.
Deviations are not always large changes. Not brushing your teeth long enough, or using too much toothpaste, are also deviations from the happy path. All those deviations, no matter how big or small they are, affect your life, creating a graph like the one you can see above. Total chaos.
So ask yourself if you are really sustainable.
Feeding a Process Mining tool with the data behind the previous graph will give you a more user-friendly visual representation of those deviations.
Then, with the numbers at hand, you have true transparency about how your actions are harming the environment, because you have a single source of truth in place.
You can start asking questions and solving problems, automating what you care about most. Let’s take a tiny example. With that information at hand, you can automate actions when your water consumption goes crazy.
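As a toy version of that automation, the sketch below flags any day whose water consumption is far above the average of the previous days. The readings, the window size, the 1.5x threshold, and the notify action are all assumptions made up for illustration. A real setup would pull the data from a meter and trigger a real action.

```python
from statistics import mean

# Hypothetical daily water consumption in litres, invented for illustration.
DAILY_LITRES = [120, 115, 130, 125, 118, 122, 260]


def detect_spikes(readings, window=5, factor=1.5):
    """Flag days whose usage exceeds `factor` times the trailing average."""
    alerts = []
    for day in range(window, len(readings)):
        baseline = mean(readings[day - window:day])
        if readings[day] > factor * baseline:
            alerts.append((day, readings[day], baseline))
    return alerts


def notify(day, litres, baseline):
    """Placeholder action; a real automation could send a message or close a valve."""
    print(f"Day {day}: {litres} L used, recent average is {baseline:.0f} L. Check for leaks!")


alerts = detect_spikes(DAILY_LITRES)
for alert in alerts:
    notify(*alert)
```

The point is not the threshold itself but the loop: data comes in, a deviation is detected, and an action fires without anyone having to look at a dashboard.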
The previous case is a small example of how Process Mining can help you reduce your footprints. But imagine how this technology can help companies reduce their emissions.
I’d like to present some stories about how companies from different industries are using Process Mining for different purposes, but especially to reduce their emissions.
A customer from the chemical industry embraced Process Mining and reduced their carbon emissions by up to 6%. How did Process Mining help them?
They have a single source of truth, built by integrating different data sources into just one.
With that source of information, they are now able to detect and quantify key emission drivers in real time, like direct emissions, energy, and water consumption.
Finally, they achieved transparency on opportunities to reduce emissions, with automation and persona-focused recommendations.
Talking about robotics and automation companies, there is one customer that, before starting to use Process Mining, had some problems to address:
Shipping emissions across their supply chain were not measured
Root causes could not be detected, so costs could not be balanced against emissions for optimal shipment planning
There was a lack of emissions transparency for customers
We helped them conduct a proof of concept based on Process Mining to tackle these issues in three (baby) steps:
Quantify their shipping emissions
Create automated actions to detect root causes for delays in the supply chain
Build automatic decision-making to optimize the process
The result: more than 139k shipments with quantified emissions, 16M kgCO2e of outbound shipping emissions measured, and a CO2e reduction potential of more than 8% identified.
So we have the tools, the knowledge, and the skills. We understand the problem, and we know the solution. We need to become efficient. Being efficient in our day-to-day activities is crucial, whether at work or when choosing the shortest path to our destination. And for sure, efficiency is something that the planet will appreciate.
As we face ever more complex challenges, from climate change to global pandemics, we need engineering more than ever.
And by embracing process mining, engineers can lead the solutions that will shape our future. Let’s embrace this powerful tool, and work together to create a better world for ourselves and for generations to come.
Because Earth is our future, and it is our only home. Let’s work together to make it a better planet so that we don’t need to move ourselves to Mars.
If you want to know more, I recommend watching the talks themselves. They were recorded in both Spanish and English, they were the starting point for this post, and you will find more insights about Process Mining and sustainability in them.