Malware Detection Template

		Object Oriented Cybersecurity Detection Architecture (OOCDA) Suite

Malware Detection Template Analysis
Introduction:
Malicious software, or malware, is among the most damaging forms of cyberattack and it is very difficult to detect. We cannot afford to think only in existing terms such as firewalls, encryption, scanning and secure sockets; we also need to think in terms of automation, Artificial Intelligence (AI), Machine Learning (ML), virtualization and integration.

Waiting for cybersecurity vendors to do our job and protect our livelihood and day-to-day business is a risk which hackers have exploited. The perfect example is the SolarWinds hack, where the malicious code was hidden in plain sight. SolarWinds is a major software vendor which provides system management tools for network and infrastructure monitoring, and other technical services, to hundreds of thousands of organizations around the world.

We need to figure out hackers' strategies. For our own detection and prevention, we are introducing Sam's Two Basic Rules and making stolen data of no value to hackers.

Definition:
Malware is an application, a file or a piece of code which infects any part of a computer or mobile system, end to end. Malware comes in many variants, and there are numerous methods of infecting computer and mobile systems.
Image #1 is our attempt to picture what we are dealing with:

Malware Attack Levels

Image #1

Existing Malware:
Cybercrime is a billion-dollar business (does crime pay?), and political warfare and state-sponsored hacking are real. Malware is nothing but malicious applications, programs, code and tools, and the internet is its backyard. Hackers try to find system vulnerabilities. They change and adapt their tactics to offset any new cyber defenses. They are using Artificial Intelligence (AI) and Machine Learning (ML), and their code is hidden in plain sight.

Entry Points:
An Entry Point is a place where a person or thing enters. A cloud system or cluster of networks may have over 100,000 elements running, and any element on the network can be a possible entry point and a potential source of malicious activity. Hackers have exploited such vulnerabilities.

Our Answer to such Vulnerabilities:
We believe the answer to such vulnerabilities is:

       • Think like hackers - get in their heads
       • Figure out their strategies
       • Figure out their tools and teamwork (internal and external)
       • Address and handle state-sponsored attacks
       • Understand how Artificial Intelligence (AI) and Machine Learning (ML) are used
       • Keep up to date on hackers' latest tools and techniques
       • Keep one step ahead of hackers' attacks and approaches

Malware Strategies (Know Your Adversaries):
Looking at Image #1, there is an almost endless number of Entry Points, from the user side (mobile devices or browsers) to the Operating System (OS) running on bare metal. Malware can start from any point and from either end.

Its main goals:

       1. Copy itself onto any element and perpetuate itself
       2. Keep the IP address or connection for further attacks
       3. Hide in plain sight
       4. Use multiple attacks to hide the main goal
       5. Use sheer numbers to overwhelm victims
       6. Put its code to sleep until the right condition occurs
       7. Intercept
       8. Find vulnerabilities
       9. Use Reverse Engineering to find vulnerabilities
       10. Use the human factor to gain access
       11. Try and try again until success
       12. Nothing to lose even if the hackers' attacks are detected

From the above, and not including the human factor, hackers' actions can be categorized as follows:

       1. Find vulnerabilities
       2. Copy
       3. Hide
       4. Execute
       5. Intercept
       6. Overwhelm
       7. Return for further attacks
       8. Use intelligence
       9. Persistence
       10. Teamwork (outsiders and insiders)

Simplifying Malware Detection:
The best detection of malware is prevention and trapping: preventing malware from running and trapping malware code. We need to use the operating system and system memory to perform both prevention and trapping.

Malware Code Malicious Program

Image #2

System #1 in Image #2 represents a running system with a CPU, Main Memory (Core) and Resources (Peripherals). Main Memory is where all the programs (processes), including the Operating System, run. The Heap is open memory space which any program can use when it needs more memory.

System #2 in Image #2 represents the same system, except that one of the processes or programs is malware code or a malicious program. There is NO built-in procedure which can tell whether a running program or process is malware.

Only the operating system has the power to terminate processes and programs, including malware processes. If malware takes over the operating system, then nothing can stop the malware from running the show. Therefore, it is critical that the operating system stays in charge and is able to stop any program, including malware. We need to focus on the operating system and memory to detect and prevent malware.

Types of Operating Systems (OS):
An OS is a set of programs, and each program is composed of statements. There are a number of OS types, distinguished by the services they perform.
The following is a set of OS types:

       1. Batch OS
       2. Distributed OS
       3. Multitasking OS
       4. Network OS
       5. Real-Time OS
       6. Mobile OS
       7. Hypervisor
       8. Routers - a router falls in the embedded system category; it is a computer system with a limited number of tasks
       9. Browser - the browser executes code in a way similar to an OS

These operating systems (OS) must be protected, and they should also be used to detect malicious code or statements. Our protection and detection should not degrade OS performance.

Computer and Mobile Memory:
Computer memory is nothing but a structured array of bytes. To prevent hackers from uploading and running their code or malicious programs, we need to perform the following:

       • Scan for malicious code
       • Track running programs, their functionality and their size
       • Remove and delete any terminated code or programs
       • Trap any suspected or malicious statements
       • Set red flags for deletion of any sort of malicious statements or code
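As a rough illustration of the tracking and red-flag steps above, the sketch below compares a snapshot of running programs against an approved baseline. The `ProcessInfo` record, the `flag_processes` helper and the 20% growth tolerance are assumptions made for this example only; a real utility would obtain the snapshot from the OS rather than from a hand-written list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessInfo:
    """Hypothetical record for one running program."""
    name: str
    size_bytes: int


def flag_processes(baseline, snapshot, growth_tolerance=0.20):
    """Return (name, reason) red flags for a snapshot of running programs.

    baseline maps an approved program name to its expected memory size.
    growth_tolerance is an assumed threshold, not a recommended value.
    """
    flags = []
    for proc in snapshot:
        expected = baseline.get(proc.name)
        if expected is None:
            # Program was never approved: candidate for trapping.
            flags.append((proc.name, "not in approved baseline"))
        elif proc.size_bytes > expected * (1 + growth_tolerance):
            # Approved program is larger than expected: code may have been added.
            flags.append((proc.name, "size grew beyond tolerance"))
    return flags


baseline = {"sshd": 1_000_000, "nginx": 4_000_000}
snapshot = [
    ProcessInfo("sshd", 1_050_000),
    ProcessInfo("nginx", 6_000_000),
    ProcessInfo("updater", 300_000),
]
red_flags = flag_processes(baseline, snapshot)
```

Here `nginx` is flagged for its size and `updater` for being unknown, while `sshd` passes. Matching by name alone is easy to spoof, so this only shows the bookkeeping, not a complete defense.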

How to use the OS and memory in detection of malware?
We need to develop programs in shell scripts (Linux, Unix, and Windows), C and C++ to be added to the OS kernel code. Developers, architects and testers need to team up and start building such utilities and service engines.

Sam's Two Basic Rules:
A running program is composed of statements which are executed by the operating system. For any program or code to run:

       1. The program must be loaded in memory to be executed by the OS
       2. The OS executes the program's statements

Therefore, to stop any code or program's statements from being executed, we need to eliminate at least one of the following:

       1. Memory space to load the code (program statements)
       2. The OS executing the program statements

In the case of hackers' code, we need to do the following:

       1. Guard memory
       2. Stop the OS from executing any malicious code or statements

Sam's Two Basic Rules are easier said than done, due to numerous factors. For example, hackers may add their malicious code in the middle or at the end of any program, or load the code directly into memory to be executed by the OS. Hackers can also hack the OS itself, and then they are running the show. In addition, guarding memory and monitoring OS execution may affect system performance.
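One way to picture the second rule (stop the OS from executing unapproved code) is an allowlist check performed before a program image is loaded. The sketch below uses SHA-256 digests for that check; hash allowlisting is a technique chosen here for illustration and is not something the rules above prescribe. The `ExecutionGate` class and its method names are hypothetical.

```python
import hashlib


class ExecutionGate:
    """Hypothetical gate that only admits program images approved in advance."""

    def __init__(self):
        self._approved = set()  # SHA-256 hex digests of approved images

    def approve(self, image: bytes) -> str:
        """Record the digest of a trusted program image and return it."""
        digest = hashlib.sha256(image).hexdigest()
        self._approved.add(digest)
        return digest

    def may_execute(self, image: bytes) -> bool:
        """True only if this exact byte sequence was approved earlier."""
        return hashlib.sha256(image).hexdigest() in self._approved


gate = ExecutionGate()
trusted = b"print('payroll report')"
gate.approve(trusted)

# Code added at the end of an approved program changes its digest.
tampered = trusted + b"\nimport os  # statements appended by an attacker"
```

With this setup `gate.may_execute(trusted)` is true and `gate.may_execute(tampered)` is false, which covers the case of code inserted into an existing program. It does not cover the other difficulties named above: code loaded directly into memory, or an OS that has itself been compromised.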

Our Strategies:
Our strategies are:

       • Using Sam's Two Basic Rules
       • Making data of no value to hackers or thieves

Using Sam's Two Basic Rules:
The malware strategy is to get copied and then get executed to do its tasks. Any program or code needs an operating system or browser engine to run. Malware can also insert its code in the middle of, or as part of, other programs. Therefore stopping any malware requires:

       1. Preventing malware from copying itself onto any media
       2. Trapping malware code before it executes
       3. Scanning for malware code in any suspected programs
       4. Making the data captured by malware of no value (compressed-encrypted)
       5. Removing malware from any media
       6. Using Machine Learning in detecting malware
       7. Tracking sites, countries, hosting services, known servers, tactics, etc. to help in detection
       8. Setting up remedy and recovery processes for handling a malware attack
       9. Keeping statistics of malware attacks, strategies, patterns, etc. and feeding them to Machine Learning
       10. Documenting all the above for future planning

Attack Type:
Malware can be any of the following:
Ransomware, Spyware, Trojans, Adware, Remote Access, Viruses, Worms, Rootkits, Keyloggers, Bots, Macro viruses, File infectors, System or boot-record infectors, Polymorphic viruses, Stealth viruses, Logic bombs, Droppers.

How is the attack carried out?
Malware is an unwanted piece of programming or software which installs itself on a target system, causing unusual behavior. It can attach itself to legitimate code and propagate; it can lurk in useful applications or replicate itself across the Internet. Malware's goal is to get into the system and propagate itself onto any media. If malware cannot get in, it may instead jam or overwhelm its victims, preventing them from functioning.

Victims, Losses, Damages:
Malware damage ranges from denying access to programs and deleting files to stealing information and spreading to other systems. Some forms of malware are designed to extort the victim in some way.

Damages - Victims:
Personal:
Perhaps the most notable form of malware is Ransomware – a program designed to encrypt the victim’s files and then ask them to pay a ransom in order to get the decryption key.

Machine - Servers and Storage:
Stolen data can be used to access servers, storage, etc.

Networks:
Stolen data can be used to access networks, storage, etc.

Data:
Critical data is at risk.

Software:
Not necessarily software damage, but hackers may upload their viruses, code, etc.

Others:
How stolen data can be used, and to do what, is an open question. Links may allow hackers to upload their viruses, code, etc.

Using Numbers, Formulas and Algorithms:
With malware programs and code, it is very difficult to figure out whether any given code is legitimate or not. The code can have nested functions and hashed functions, and it is not practical to work out their intent. There are a number of patterns hackers use to hide their real intent.

We need to think in terms of Numbers, Formulas and Algorithms. Creating matrices of data would help speed up processing and support intelligent decisions.

Our Machine Learning Automated Matrices Analysis Using Numbers, Formulas and Algorithms:
The architecture of our Machine Learning (ML) Tools is meant to provide fast analysis and decision-making. This means that our ML Tools would be able to scan any code and supply a "Red Flag" with a score or value for how likely it is to be hacking code. To make our architecture presentable to our audience, we need to break down our architecture-design into the following steps:

•	Create matrices of all built-in functions of each of the following languages:
	1. Linux, Unix, and Windows scripts
	2. C and C++
	3. Python
	4. Java
	5. JavaScript
	6. Assembly instructions
	7. PHP
	8. etc.
•	Create matrices of programming statements which hackers often use, for each language
•	Give each function a weight-value (red flag) of (approved, deny, allow, etc.)
•	Perform a quick text search, which is a byte search and therefore fast
•	For .exe files and executable code, create tools which perform fast reverse engineering of the code and create a byte search-scannable buffer (see the examples) to speed up performance
•	Calculate the Weight-Value number
•	Based on the probability (Red Flag) of having hacker code, our ML Tools would make a calculated decision
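The steps above can be sketched as a small lookup structure plus a byte search. Everything specific in this sketch is an assumption for illustration: the `WEIGHT_MATRICES` tokens and weights are placeholders rather than vetted values, the "review" decision is an added middle label, and the 0.5 and 0.25 thresholds are arbitrary.

```python
# Hypothetical weight-value matrices: language -> {token: red-flag weight 0..10}.
WEIGHT_MATRICES = {
    "c": {b"goto": 8, b"sleep(": 9, b"connect(": 7, b"printf(": 0},
    "python": {b"exec(": 9, b"eval(": 8, b"socket.": 6, b"print(": 0},
}


def score_code(code: bytes, language: str):
    """Byte-search the code for each known token and sum the weights found.

    Returns (total, maximum, ratio), where maximum is the sum of all weights
    in the language's matrix and ratio is total / maximum.
    """
    matrix = WEIGHT_MATRICES[language]
    total = sum(weight for token, weight in matrix.items() if token in code)
    maximum = sum(matrix.values())
    return total, maximum, total / maximum


def decide(ratio: float, deny_at: float = 0.5, review_at: float = 0.25) -> str:
    """Turn the red-flag ratio into a decision (thresholds are assumptions)."""
    if ratio >= deny_at:
        return "deny"
    if ratio >= review_at:
        return "review"
    return "allow"


sample = b"while (1) { sleep(60); goto retry; }"
benign = b'printf("hello");'
sample_total, sample_max, sample_ratio = score_code(sample, "c")
```

The sample hits `goto` and `sleep(` for 17 out of a possible 24 and is denied, while the benign line scores 0 and is allowed. A plain substring search will also match tokens inside longer identifiers, so a production scanner would need tokenization on top of this.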

For example, hackers could take an image file and add their malicious code (.exe code) anywhere in the image file. Our architecture-design would perform the following scanning sequence:

1	Fast byte-text scan looking for any byte sequence which does not represent pixel data - First Red Flag
2	Copy the suspected code into a buffer (fast processing) and perform quick reverse engineering looking for malicious functions - 2nd Red Flag
3	Based on our matrices analysis, our ML Tools decision-maker would take action
4	A Red Flag with a high score is trapped and removed for further analysis, so there is no loss of performance

Speed and performance are critical to our processes.
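A narrow version of the first two scanning steps can be sketched for one image format. The code below walks the chunks of a PNG file, copies anything found after the closing IEND chunk into a buffer, and checks whether that buffer starts with the "MZ" signature of a Windows executable. It only covers data appended after the end of the image, not code hidden elsewhere in the file, and the signature check stands in for the reverse engineering step. The function names and the tiny test image (which has a header but no pixel data) are assumptions for the demonstration.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_trailing_bytes(data: bytes) -> bytes:
    """Return whatever follows the IEND chunk of a PNG file (b"" if nothing)."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[offset:offset + 8])
        offset += 8 + length + 4
        if chunk_type == b"IEND":
            return data[offset:]
    raise ValueError("IEND chunk not found")


def scan_image(data: bytes):
    """Return (red_flags, suspect_buffer) for a PNG byte string."""
    flags = []
    extra = png_trailing_bytes(data)
    if extra:
        flags.append("bytes after end of image")  # first red flag
        if extra.startswith(b"MZ"):
            flags.append("appended data looks like a Windows executable")  # second
    return flags, extra


def _chunk(chunk_type: bytes, body: bytes) -> bytes:
    """Build one PNG chunk (helper for the demonstration data only)."""
    crc = zlib.crc32(chunk_type + body)
    return struct.pack(">I", len(body)) + chunk_type + body + struct.pack(">I", crc)


ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)  # 1x1 greyscale header
clean = PNG_SIGNATURE + _chunk(b"IHDR", ihdr) + _chunk(b"IEND", b"")
infected = clean + b"MZ\x90\x00payload"
```

Calling `scan_image(clean)` returns no flags, while `scan_image(infected)` returns both flags together with the appended bytes, which can then be handed to the matrices analysis.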

Code Analysis Example:
Let us look at malware code in the C language in Code Segment #1. Assume that our reverse engineering of an executable created a byte search-scannable buffer and found the code in Code Segment #1. Our ML Tools analysis would then process a buffer containing the Code Segment #1 code. Our matrices would hold function names and their Red Flag weight-values for possible hacker code.

Note: a "jump" (goto) statement is not considered good programming practice.

                     /* keep retrying the connection once a minute until it succeeds */
                     startAgain:
                     while (connect(mySocket, (struct sockaddr *) &serverAddress, sizeof(serverAddress)) != 0) {
                                   sleep(60);
                                   goto startAgain;
                     }
                                   Code Segment #1

By performing text search of Code Segment #1, we would be able to calculate the following:

       1. Jump statement
              Red flag = 8 out of 10
       2. A while Loop
              Red flag = 2 out of 10
       3. Sleep function
              Red flag = 9 out of 10
       4. Jump within the loop
              Red flag = 10 out of 10
       5. Connection
              Red flag = 7 out of 10
       6. Testing for connection
              Red flag = 8 out of 10
       ________________________________________________
                     Red flag total = 44 out of 60

That is 44 out of 60, or 73%, so by this scoring there is well over a 50% chance that Code Segment #1 is hackers' code.
From the above example, we can clearly make use of Numbers, Formulas and Algorithms to help with our detection.
The flag matrices need a combined brainstorm of architects, analysts, developers, testers, security experts, management and anyone else who could make our detection more efficient.
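The red flag arithmetic for Code Segment #1 can be reproduced with a short text search. The six weights below are the ones listed above; the regular expressions used to detect each feature, the `RULES` table and the `red_flag_report` helper are assumptions made for this sketch, and the code text is a tidied copy of the segment.

```python
import re

CODE_SEGMENT_1 = """
startAgain:
while (connect(mySocket, (struct sockaddr *) &serverAddress, sizeof(serverAddress)) != 0) {
    sleep(60);
    goto startAgain;
}
"""

# (feature, red-flag weight out of 10, assumed detection pattern)
RULES = [
    ("Jump statement", 8, r"\bgoto\b"),
    ("A while loop", 2, r"\bwhile\b"),
    ("Sleep function", 9, r"\bsleep\s*\("),
    ("Jump within the loop", 10, r"\bwhile\b[^{]*\{[^}]*\bgoto\b"),
    ("Connection", 7, r"\bconnect\s*\("),
    ("Testing for connection", 8, r"\bconnect\s*\(.*\)\s*[!=]=\s*0"),
]


def red_flag_report(code: str):
    """Return (hits, total, maximum) for the rules that match the code."""
    hits = [(name, weight) for name, weight, pattern in RULES
            if re.search(pattern, code)]
    total = sum(weight for _, weight in hits)
    maximum = 10 * len(RULES)  # every rule is scored out of 10
    return hits, total, maximum


hits, total, maximum = red_flag_report(CODE_SEGMENT_1)
percent = round(100 * total / maximum)
```

All six rules match, giving a total of 44 out of 60, or 73% after rounding. Note that ordinary networking code also retries connections in a loop, so weights like these would need the brainstorming and tuning described above before being trusted.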

Case Study:

              http://SamEldin.com/SamQualificationPages/ReverseEngineeringObjectives.html

Consider our presentation "SolarWinds Hacking Lessons - Hidden in Plain Sight". From experience, updating or modifying someone else's code is not a small task. The added code must be tested to make sure it does what it is supposed to do. Looking at the details and the total effort involved in inserting their malicious code, these hackers must have had access to, or copies of, the DLLs, the interfaces and possible variations. Let us look at some of the tasks the hackers' code performed:

8.Initialization	9.Hash functions for data	10.Hash functions for methods name calling
11.IP addresses	12.Tracing calls	13.Sleep functions for days
14.Threads sleep functions	15.Date and timestamp	16.Search function
17.OS calls	18.Zipping and unzipping function calls	19.Compression and decompression
20.Stop services from running	21.Receive instructions from outside sites	22.Interrupt or stop services
23.Had the ability to transfer files	24.Executed files	25.Profiled the system
26.Rebooted the machine	27.Disabled system services

Using our Code Segment #1 example and weight values, the SolarWinds hacking code tasks we just listed are more than enough to raise a Red Flag with a weight-value higher than 50%.

Scanning For ".exe" Files or Executable Programs:
We are building detection based on trapping hacking files or code. We would be searching for code that is mostly used by hackers, or calculating the likelihood of bad intent in these incoming files. The speed of our detection is critical. Based on the type of programming language hackers are using, we need to figure out the file type. Some programming languages are interpreted while others are compiled into .exe files and DLLs. Therefore, we need to build tools to perform such scanning.

We are looking into using decompilers (Reverse Engineering) and text search to perform the scanning.
We need to team up with Reverse Engineering tool vendors and customize our scanning tools.
Machine Learning is a big part of the scanning tools.
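Figuring out the file type can start from the first few bytes of the file, since compiled formats carry fixed signatures. The sketch below routes a file either to the reverse engineering path or to plain text search. The signature table covers only a handful of well-known formats, and the `classify` and `needs_decompiling` helpers are hypothetical names for this example.

```python
# Leading byte signatures of a few common formats (not an exhaustive list).
MAGIC_SIGNATURES = [
    (b"MZ", "Windows executable or DLL (PE)"),
    (b"\x7fELF", "Linux/Unix executable (ELF)"),
    (b"\xca\xfe\xba\xbe", "Java class file or Mach-O universal binary"),
    (b"PK\x03\x04", "ZIP-based archive (for example a JAR)"),
    (b"#!", "interpreted script with shebang"),
]

COMPILED_MAGICS = (b"MZ", b"\x7fELF", b"\xca\xfe\xba\xbe")


def classify(data: bytes) -> str:
    """Label a file by its leading bytes."""
    for magic, label in MAGIC_SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown (scan as text)"


def needs_decompiling(data: bytes) -> bool:
    """Compiled formats go to reverse engineering; everything else to text search."""
    return data.startswith(COMPILED_MAGICS)


exe_header = b"MZ\x90\x00"
shell_script = b"#!/bin/sh\necho hello\n"
```

For the two samples, `classify(exe_header)` reports a Windows executable that needs decompiling, while the shell script is labeled as interpreted and can be text-searched directly. A renamed file extension does not change these bytes, which is why the check looks at content rather than the file name.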

Tracking Memory Content and Running Programs:
We are working on building OS tracking processes. Our architecture-design specs have a list of processes and a checklist. We are building a number of utilities which would support any running OS in tracking and trapping malware or any unauthorized program. We would be building in the following languages:

       1. Linux scripts
       2. Unix scripts
       3. Windows scripts
       4. C and C++
       5. Python
       6. Java
       7. or any language that qualifies for our tracking and trapping

Applying Virtualization to Detection:
In order to apply virtualization to detection, we need to understand and use DevOps and DataOps.

DevOps:
DevOps is the infrastructure and software support which provides the running bare-metal servers with all needed hardware and software. Data backup such as NAS or file servers can also be provided by DevOps. DevOps should be utilizing Virtualization, Intelligence, Automation and Integration.

DataOps:
We recommend that DevOps and DataOps be independent services. Building DataOps as an intelligent data storage service would help keep the cloud system loosely coupled and integrable. DataOps services would work as data buffers between the running infrastructure and data storage.

Tools:
Virtualization, Intelligence, Automation and Integration are virtual tools which help scale any cloud system vertically and horizontally. DevOps and DataOps would use these tools to run and secure the running system.

Virtualization:
In short, Virtualization is using and automating DevOps to create Independent Virtual Machines with Virtual IP addresses. Each IP address is the connection to one Independent Virtual Machine. We call each Independent Virtual Machine a "Container" since it has its own operating system (OS) and all the supporting system software to run as a cloud virtual server.

For example, each Virtual Container (virtual cloud server) would be running the user-client A software as a "Component". A cloud application A (let us say a Microservices application) would have its own Container A, and all the cloud application's supporting software (components) would be running within Container A. To access Application A, user-client A would use Virtual IP Address A. Once user-client A is done with Application A, Container A and all its components would be completely deleted from virtual memory. In addition, virtual IP address A would not be used again (except after, let us say, 10 years from the running date).
Let us look at a bank user request scenario in Virtual Container and Components Diagram #1:

DevOps-DataOps Virtual Containers' Support Diagram

Virtual Container and Components Diagram #1

Let us assume a bank customer is requesting bank services and this customer has a virus on his or her laptop. Our system would create a virtual server (which only exists in memory) as a container, and it contains a virtual bank service application as a component running within the virtual server. The virus's main objective is to copy itself onto the server for further attacks. It also holds the server's IP address (in our case a virtual IP address), which the virus would use for future attacks. Regardless of the virus's objectives, the virtual server, the virtual bank service application and the virtual IP address would be completely deleted from memory once the virtual bank service application is done. Because the virtual IP address is deleted as well, no tracking or further attacks would be possible. The Virtual Data Buffer Synchronized Services would also protect the internal data structure against unauthorized access.
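The lifecycle in the bank scenario (create a container with a fresh virtual IP, delete everything when the request is done, never hand the IP out again) can be modeled in a few lines. This is an in-memory model only, not real virtualization: the `ContainerPool` class, its method names and the `10.0.0.0/29` address range are assumptions chosen for the example.

```python
import ipaddress


class ContainerPool:
    """Hypothetical in-memory model of per-request virtual containers."""

    def __init__(self, network: str = "10.0.0.0/29"):
        # Addresses are consumed from an iterator, so none is issued twice.
        self._free_ips = iter(ipaddress.ip_network(network).hosts())
        self.active = {}          # virtual IP -> application name
        self.retired_ips = set()  # virtual IPs that must not be reused

    def start(self, application: str) -> str:
        """Create a container for one request and return its virtual IP."""
        ip = str(next(self._free_ips))  # StopIteration once the range is used up
        self.active[ip] = application
        return ip

    def finish(self, ip: str) -> None:
        """Delete the container and retire its virtual IP."""
        del self.active[ip]
        self.retired_ips.add(ip)

    def is_reachable(self, ip: str) -> bool:
        return ip in self.active


pool = ContainerPool()
vip = pool.start("bank-service")   # the customer's request is served here
pool.finish(vip)                   # request done: container and IP are gone
next_vip = pool.start("bank-service")
```

After `finish`, the address the virus recorded is no longer reachable, and the next request is served on a different address. The model does not capture what happens to data the component wrote elsewhere, which is the part the data buffer services are meant to protect.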

Note: Nesting containers is not recommended.

Virtualization Vertically Scaling Using Components and Containers:
Creating a virtual cloud server as a "Container" would be done as an automated service to users. A virtual container (virtual cloud server) may contain any number of applications, but we recommend limiting it to one component-application per user request, plus all of the application's supporting services, including DataOps. Vertical scaling therefore remains open to help increase performance, speed and options. For example, in Diagram #1, DevOps can add any amount of supporting software, or even operating system tools or utilities, vertically within the container to speed up performance and support. Again, adding any tools or software is scalable vertically.

Virtualization Horizontally Scaling Using Components and Containers:
The only limit on the number of Virtual Containers and their components is the available memory on the bare-metal server. Horizontal scaling is open. If your bare-metal server runs out of memory, the DevOps service can add more bare-metal servers or supply remote IP addresses so that remote bare-metal servers can be used. The additions and remote access can also be automated without any staff (DevOps or development) involvement.

Intelligence:
We believe in turning data into weight-value fields, which can be checked, compared, cross-referenced, analyzed and stored more easily. Adding intelligence to the analysis is doable when these numbers are used to make educated guesses (intelligence) based on the values in the weight-value fields. These numbers would also help in prediction and future decisions based on the collected values.