It's been said that there are only two types of companies: those that have been hacked and those that are yet to be. Today, unfortunately, the former will be speaking to the latter.
I'll admit this is one of the more difficult articles I've faced on the X-Coding IT Studio blog. On one hand, I know the material I've prepared can genuinely help a great many companies. On the other hand, that same material started life as our recovery plan, written in reaction to a successful hacking attack. I don't think I need to add who the victim was.
Despite the personal internal conflict, I decided to share this unpleasant case study with the world for two reasons:
- "We work with teamwork, integrity and openness" is the second of the company's five values. When we wrote them down, we promised ourselves never to sweep things under the rug. Sure, we could try it just this once, such things happen every day after all, but what then? And if not in this very situation, when are we supposed to pass that test?
- We try to be an anti-fragile organization, that is, to react to a problem in a way that ensures it never occurs again. This time we put 300% of our energy into that repair. By sharing this knowledge, I hope someone will benefit and won't have to learn from their own mistakes.
What actually happened
Let me tell you what a hacking attack can look like, using our own example. In this case, "a drama in three acts" is the best description I could come up with.
Finale. Let's start with the most recent events. When the attack was over, it turned out that not much had been missing for us to lose all the data from several servers. We can talk about a lot of luck: the data encryption process was well underway and would have finished if only there had been enough disk space. If you are not familiar with ransomware attacks, they work like this: a special script encrypts all the data and then deletes itself, so there is no way to reverse the process (unless you guess the key, which is practically impossible).
Why would someone encrypt your data? Because recovering it is often priceless, and that is basically the "business model" of those behind such attacks.
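To make it concrete why such encryption cannot be reversed without the key, here is a toy sketch in Python. This is a deliberately simplified stream cipher built on SHA-256, purely illustrative, not the attackers' actual script and not production cryptography:

```python
import hashlib
import secrets

def keystream(key: bytes, length: int) -> bytes:
    """Derive a pseudo-random keystream by hashing key + block counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with the keystream: the same call both encrypts and decrypts."""
    ks = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

key = secrets.token_bytes(32)      # the attacker keeps this; the victim never sees it
plaintext = b"customer database dump"
ciphertext = xor_cipher(key, plaintext)

# With the key, decryption is trivial. Without it, brute-forcing 2**256
# possible keys is the only option, which is practically impossible.
assert xor_cipher(key, ciphertext) == plaintext
```

In a real attack the only differences are scale (every file on the disk) and the final step of the script deleting itself, and with it the key.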
Fortunately, the script didn't finish on any of the servers it was running on, so "cleaning up" the whole mess was admittedly time-consuming, but fully doable. In fact, we regained most of our operational capability the same day, and all of it after two.
Note in particular the beginning of that account. "Turned out" and "a lot of luck" are phrases that reflect very well our inertia in the face of this whole situation.
A few hours earlier. Four servers (two of ours and two external) start behaving strangely. Everything is lagging. One project stops working; a moment later, another. It turns out that in the first one the database has disappeared.
That was the unpleasant moment when we realized three things:
- we're in the middle of an attack and we don't know what's going on,
- given the way it was unfolding, this was not an automated attack,
- the dispersion (different servers) suggested the whole thing had to be directly related to us.
It's worth expanding a bit on that last point. Given that not all of the servers were our internal ones, we were sure the attack had to be related to us: either the breach of our server directly enabled the attack on the other, or knowledge gained from one server inspired the attempt on the next. Blaming it on coincidence would be naive.
2 weeks earlier. The post-hack analysis (which I'll talk about a bit later) showed that the whole affair had started at our place about two weeks earlier. The verdict, which is still ringing in my ears: access to one of our internal systems had leaked.
At X-Coding IT Studio, as at any other company, difficulties happen. This one instantly ranked at least in the TOP 3, and we're talking about a 10-year-old company here.
Now I have an exercise for you, and it doesn't matter whether you are an agency like us, an online store, or yet another kind of business. Do a quick examination of conscience: how dangerous would it be if someone got access to one of your systems (e.g. your CRM)? For example, how many credentials are stored in there?
I have a feeling we passed this exam surprisingly well, because in the end we suspect only one security breach of this kind. But believe me, my first thought was much grimmer. If I were to visualize it, it was a bit like a thief stealing your car keys and only managing to take the radio.
Why did it happen?
So much for the course of events.
It's hard to count how many hours we spent verifying all the logs on the servers (on our own and with the support of two companies that do this professionally). There were two goals: to reconstruct the course of events and to understand where we went wrong.
I've already told you about the first one. The error was found, too. At one point, three things coincided:
- an application password sat in a place on the server where it should not have been (or the place should not have been accessible),
- the application was vulnerable to a backdoor,
- the environment was vulnerable to the backdoor.
Two of those three causes were on us.
I ask you to read those three points again. Now try to evaluate the issue objectively: is this a mistake of very large caliber?
In my opinion, no. Don't get me wrong, this is not an attempt to whitewash myself. But let's agree it wasn't an error on the level of "database dump available in a public directory" (although historically we have found such errors as well). Of course, the effect matters more than the cause, and in that sense it doesn't matter why the server wasn't up to date or why the password was where it was. But note that you don't need big failures at all: the right alignment of a few small ones is enough to produce a big mistake.
What did we do about it?
Up to this point, the tone of the article has been purely grim. If it stressed you out, I've achieved my goal, because chances are the following action plan will make more than one agency genuinely safer.
1. Informational responsibility
I deliberately don't call this an "obligation", because I want to avoid the connotation that we did it only because we had to. We launched three information channels at the same time:
- Clients - we talked to every client, whether or not they had been affected, and communicated in great detail how things stood. Protecting our clients is our biggest responsibility.
- RODO - every leak involves some personal data. Not wanting to judge its severity ourselves, we reported the matter under RODO (the Polish GDPR) straight away.
- The police - whatever else we say, we must remember that a crime occurred. And although we don't see much chance of success, the matter has been reported to the law enforcement authorities.
If this situation ever happens to you, be sure to remember all three. The strategy of avoiding the subject, or even covering the matter up, has a very short shelf life.
2. The big cleanup
In parallel with those conversations, we launched a process of first changing and then cleaning up all the sensitive credentials that could have fallen into the wrong hands. Specifically, in a very short period of time we had to:
- find all credentials that someone might once have left in our systems,
- delete those records,
- change all access data,
- revoke public access to links found in the systems (e.g. Google Docs).
You can imagine how many systems were affected and how quickly the action had to happen. I'll point out one more important thing, which may not be visible at first glance: all of this would have been far less stressful if only we had kept our passwords under control. That's not 100% feasible in practice, but it's certainly worth attempting. So if you have passwords written down on a wiki, in Google Docs, or on a piece of paper stuck to a monitor, this is a good time to stop.
3. Security audit
The post-hack analysis and the security audit of the servers that were breached needed to happen as quickly as the previous two points. After all, regaining operational capability on a compromised architecture is asking for more trouble.
So we shut down everything we could live without, secured what we couldn't, and in the meantime ran the two independent audits I mentioned earlier. The goal was to remove every vulnerability and to prepare a plan for raising our level of security.
This step of removing vulnerabilities is unfortunately often skipped; companies stop at treating the symptoms and bolting on new security. It's a bit like building a big wall and leaving the door open. Pointless.
4. Implementing the remediation plan
We come to the most important part of the article: the steps we took to be less vulnerable in the future, and, along the way, advice so you can learn from our mistakes.
- Restricting access to servers - this is a great example of how shortcuts happen. Someone once had no way to generate a key, so on some server we disabled key-based login "for a while". To quote Edgar Allan Poe: "Nevermore". We've reviewed all the servers we administer and reduced login options to a minimum with maximum security.
- Data cleanup - we went through all the projects and all the servers, looking at the data they store (for example, a chunk of a production database sitting on a test server). Everything was deleted or anonymized.
- Removal of all passwords and accesses - we are now implementing a systematic way to manage passwords securely, one that will adequately protect them from unauthorized access.
- Implementation of 2FA - two-factor authentication is an additional way of securing login with a one-time code that can only be generated on a device authorized by the user. We are implementing it everywhere the technology allows. If you have a system that stores passwords, it is definitely a must-have there.
- Separate servers - we want to avoid a situation where breaking into one system grants access to another. Separating systems significantly reduces that risk, so we are implementing a "one server - one service" model.
- Implementing VPN - if the system is only for our use, it will be hidden behind the company VPN.
- Cyclical manual and automated testing of all environments - we will subject literally everything in our care to periodic security checks. If a new vulnerability affects us, we want to be the first to know about it.
- Security policy implementation - we are now introducing a set of security and access rules that will give us better control over the access we grant, while letting us maintain a consistent standard.
- Internal communication - everything we are implementing now will become part of the onboarding manual for new employees. Existing employees will be trained and will have a real say in the future development of security.
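For the "restricting access to servers" point, a minimal sketch of what key-only SSH hardening can look like, assuming OpenSSH 8.2+ with the `sshd_config.d` include directory; the file name, group name and values are illustrative, not our exact configuration:

```shell
# Illustrative hardening drop-in; adjust paths and names to your distribution.
sudo tee /etc/ssh/sshd_config.d/90-hardening.conf <<'EOF'
PasswordAuthentication no     # keys only: no more password logins "for a while"
PermitRootLogin no            # admins log in as themselves, then escalate
AllowGroups ssh-users         # explicit allow-list instead of "everyone"
MaxAuthTries 3
EOF
sudo sshd -t && sudo systemctl reload sshd   # validate the config before reloading
```

The `sshd -t` check matters: an invalid config reloaded blindly can lock everyone out, which is its own kind of incident.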
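The 2FA point above relies on one-time codes, usually TOTP (RFC 6238). As a sketch of how little magic is involved, here is the whole algorithm in stdlib Python; for real deployments use an audited library, and note the secret below is the RFC's published test key, not a real one:

```python
import hashlib
import hmac
import struct
import time

def totp(secret: bytes, at=None, digits: int = 6, step: int = 30) -> str:
    """RFC 6238 TOTP: HMAC-SHA1 over the current 30-second time counter."""
    counter = int(time.time() if at is None else at) // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                       # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)

# RFC 6238 test vector: secret "12345678901234567890", time 59 -> "94287082"
print(totp(b"12345678901234567890", at=59, digits=8))  # prints 94287082
```

Because the code depends on a shared secret and the current time, a leaked password alone is no longer enough to log in, which is exactly what makes it a must-have for systems that store credentials.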
I'd like to say that points x, y, z are super important, while g, i, k can be dispensed with. But that's not the case.
I'll say more: such a policy has no chance of working unless you show enough determination in implementing it, that is, unless you treat security as a process. Otherwise everything starts to rust again: someone gives someone access "for a while", or quickly installs a project on the first server at hand without paying enough attention to security.
I would like to say that thanks to the changes we will be "premium".
But it won't be true.
Thanks to the changes, we will be what we should have been from the very beginning. I'm sure of one thing: in our case, this was the first such incident in 10 years, and I don't have to convince you that we're damn motivated to make it the last. And to everyone reading, I wish you never have to write an article like this.