In this article I will try to explain in my own words everything that I learned during my Computer Engineering degree about the security of software systems.

01 - Cryptography

Definitions

Here are some basic concepts:

  • Cryptography: The study of techniques of altering the representation of a message.
  • Cleartext: Original message. It is legible.
  • Ciphertext: Illegible text. It is the result of encryption performed on cleartext.
  • Cipher: Algorithm that transforms cleartext into ciphertext.
  • Key: Sequence of symbols that the ciphers use.

Symmetric cipher

The cleartext is encrypted and decrypted using one key.

Symmetric cipher diagram

This technique relies on these factors:

  • It is impossible to recover the original text knowing exclusively the ciphertext.
  • The shared key is secret.
  • It is possible to physically implement the algorithms.

These are some classic techniques that symmetric ciphers use:

  1. Substitution: The symbols of the cleartext are substituted by others. There are several types:
    • Simple Monoalphabetic Substitution: Each symbol of the cleartext is replaced by another one.

      An example of this cipher is the Caesar cipher (here with a shift of 7):

      cleartext:         TREE
      ciphertext:        AYLL
      
    • Polygraphic Monoalphabetic Substitution: Several symbols are replaced by several others.

      An example of this type of cipher is the Playfair cipher (the doubled E is split with a filler letter, so the ciphertext is longer than the cleartext):

      cleartext:         TREE
      ciphertext:        ODKUKU
      
    • Polyalphabetic substitution: Use several monoalphabetic substitutions as you encrypt the cleartext.

      An example of this type of cipher is the Vigenère cipher:

      cleartext:         TREE
      key:               ABCD
      ciphertext:        TSGH
      
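    • One-time pad: Use a random key as long as the cleartext. The key is independent of the text and must never be reused.

      With binary data you encrypt and decrypt with the XOR function. With letters, as in this example, the equivalent is adding the key modulo 26 to encrypt and subtracting it to decrypt: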

      cleartext:         TREE
      key:               QWER
      ciphertext:        JNIV
      key:               QWER
      cleartext:         TREE
      
  2. Transposition: The symbols of the cleartext get permuted. In other words, it reorders the symbols, taking them in blocks.

    This is trivial to cryptanalyze, so that’s no good, my friend.
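The substitution examples above can be reproduced with a short Python sketch. It assumes uppercase A-Z text only, and the function names are mine, chosen for illustration. The one-time pad here adds the key modulo 26, which is the letter equivalent of XOR on binary data.

```python
from string import ascii_uppercase as ALPHABET


def shift_letter(letter: str, shift: int) -> str:
    """Move one letter `shift` positions along the alphabet, wrapping around."""
    return ALPHABET[(ALPHABET.index(letter) + shift) % 26]


def caesar(text: str, shift: int) -> str:
    """Monoalphabetic substitution: the same shift for every letter."""
    return "".join(shift_letter(c, shift) for c in text)


def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Polyalphabetic substitution: the shift depends on the key letter in use."""
    sign = -1 if decrypt else 1
    return "".join(
        shift_letter(c, sign * ALPHABET.index(key[i % len(key)]))
        for i, c in enumerate(text)
    )


def one_time_pad(text: str, key: str, decrypt: bool = False) -> str:
    """Same arithmetic as Vigenère, but the key must be as long as the text."""
    if len(key) != len(text):
        raise ValueError("the key must be as long as the text")
    return vigenere(text, key, decrypt)


print(caesar("TREE", 7))             # AYLL
print(vigenere("TREE", "ABCD"))      # TSGH
print(one_time_pad("TREE", "QWER"))  # JNIV
```

Decryption just reverses the shift, e.g. one_time_pad("JNIV", "QWER", decrypt=True) gives back TREE.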

ENIGMA machine:

A set of 3 cylinders that rotate on their own axis. In each cylinder there are 26 contacts (as for 26 letters), and each contact is connected to another one of the next cylinder. Each cylinder is a cryptosystem that does Polyalphabetic Substitution of period 26. Each cylinder is connected to the next, as a cascade.

There are 26x26x26 = 17576 substitution alphabets before the system loops.

With 5 cylinders, that number escalates to 11881376. This is the basis of the crypt command of UNIX.

Standard encryption

DES (which descends from the LUCIFER cipher, created by IBM) was adopted in 1977 by the NBS (now the NIST) as the Data Encryption Standard.

It encrypts 64-bit blocks using 56-bit keys.

In 19 stages, 16 of which are iterations of the same round function, it transforms 64 bits of cleartext into 64 bits of ciphertext.

Being a symmetric cipher, the same algorithm is used for encryption and decryption; decryption just applies the round keys in reverse order.

The following scheme shows the bit-level algorithm:

DES diagram

The DES cipher can operate in four different modes, in order to be able to encrypt data of different lengths. These modes of operation are the following:

  1. ECB: Stands for Electronic CodeBook: The simplest mode, as it encrypts each 64-bit block independently. Identical cleartext blocks therefore produce identical ciphertext blocks.
  2. CBC: Stands for Cipher Block Chaining: XORs the previous encrypted block with the current block before encrypting it. For this reason, it needs an initialization block for the first block, which both the encryption and the decryption side must know.
  3. CFB: Stands for Cipher FeedBack: This mode allows the algorithm to encrypt units of data smaller than a block. Like the previous mode, it needs an initialization block. The previous encrypted block (or the initialization block at the start) is encrypted, and the result is XORed with the cleartext to produce the next piece of ciphertext.
  4. OFB: Stands for Output FeedBack: Works like Cipher FeedBack, but what gets fed back is the output of the cipher before the XOR operation, not the ciphertext after it. The advantage of this mode is that a transmission error in one block does not affect the rest of the blocks.
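To see why chaining matters, here is a minimal sketch that compares ECB and CBC. The block cipher is a toy stand-in that I made up (byte-wise addition of the key), not DES, and the key, initialization block and message are arbitrary values; only the way the blocks are chained is the point.

```python
BLOCK_SIZE = 8  # bytes, i.e. 64-bit blocks as in DES


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy stand-in for a block cipher (NOT DES): add the key byte-wise mod 256."""
    return bytes((b + k) % 256 for b, k in zip(block, key))


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    return bytes((b - k) % 256 for b, k in zip(block, key))


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def blocks(data: bytes) -> list:
    # Assumes len(data) is a multiple of BLOCK_SIZE (no padding in this sketch).
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def ecb_encrypt(key: bytes, data: bytes) -> bytes:
    # Every block is encrypted on its own.
    return b"".join(toy_encrypt_block(key, block) for block in blocks(data))


def cbc_encrypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    previous, out = iv, []
    for block in blocks(data):
        # XOR with the previous ciphertext block, then encrypt.
        previous = toy_encrypt_block(key, xor(block, previous))
        out.append(previous)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    previous, out = iv, []
    for block in blocks(data):
        out.append(xor(toy_decrypt_block(key, block), previous))
        previous = block
    return b"".join(out)


KEY = b"toy-key!"             # arbitrary 8-byte key
IV = bytes(range(8))          # arbitrary initialization block
MESSAGE = b"SAMEBLOKSAMEBLOK"  # two identical 8-byte blocks

ecb = ecb_encrypt(KEY, MESSAGE)
cbc = cbc_encrypt(KEY, IV, MESSAGE)
print(ecb[:8] == ecb[8:])  # True: ECB leaks that both blocks are equal
print(cbc[:8] == cbc[8:])  # False: chaining hides the repetition
```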

3DES: Stands for Triple DES, as it applies DES in three stages, each one with a 56-bit key. In the common two-key variant, the first and third stages (the odd ones) share the same key and the second stage (the even one) uses a different one.

Against this two-key variant, the best known attack takes about 2^(120-log2(n)) operations, where n is the number of known plaintext blocks.

IDEA

It stands for International Data Encryption Algorithm. It uses 64-bit blocks and 128-bit keys.

It has 8 iterations and a final stage that transforms everything into a 64-bit ciphertext block, and it supports the same modes of operation as the DES algorithm.

AES

Winner of the public contest that the NIST launched in 1997 to replace the DES and 3DES algorithms.

AES is a subset of the Rijndael algorithm (by Joan Daemen and Vincent Rijmen), as Rijndael allows several different block and key sizes.

The official release of the AES is published by the NIST as FIPS 197.

Some particular attributes of this algorithm are the following:

  • Works with bytes.
  • Operates over the finite field GF(2^8), also written F256.
  • Uses its own arithmetic operations (sum, product).
  • Key size can be 128, 192 or 256 bits.
  • Block size of 128 bits.
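As a small illustration of that "own arithmetic", this sketch implements the sum and the product of two bytes in GF(2^8) with the AES reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B). The function names are mine, and the sample values are the worked example from the AES specification.

```python
def gf_add(a: int, b: int) -> int:
    """Sum in GF(2^8): a plain XOR of the two bytes."""
    return a ^ b


def gf_mul(a: int, b: int) -> int:
    """Product in GF(2^8), reduced modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:        # this bit of b is set: add the current multiple of a
            result ^= a
        a <<= 1          # multiply a by x
        if a & 0x100:    # overflowed 8 bits: reduce modulo the polynomial
            a ^= 0x11B
        b >>= 1
    return result


print(hex(gf_add(0x57, 0x83)))  # 0xd4
print(hex(gf_mul(0x57, 0x83)))  # 0xc1
```

Note that neither operation can overflow a byte, which is what lets AES work byte by byte.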

Asymmetric cipher

Using two keys instead of one, it was a revolution in cryptology, as it solves two problems that are hard to solve with a secret key:

  • Key distribution.
  • Digital signature.

It is based on a new kind of elementary transformation discovered at the time. Instead of transpositions and substitutions, mathematical functions are used.

The requirements for this type of algorithm are the following:

  • The two keys are different.
  • One algorithm and key for encryption, and another pair for decryption.
  • Computationally infeasible to derive one of the keys knowing the other one and the cipher algorithm.
  • Optionally, the keys are interchangeable.

The basic scheme is presented in the next diagram:

Asymmetric cipher diagram

Diffie-Hellman rules:

  • Easy for a computer to generate a pair of keys (public and private).
  • Easy encryption, in computing terms.
  • Knowing the private key, the decryption must be easy.
  • Impossible to obtain the private key from the public key.
  • Impossible to obtain the cleartext if only the public key is known.
  • Optionally, the encryption and decryption algorithms can be applied in any order.

If the last requirement is met, we can also use this algorithm for:

  • Authentication: Identification and validation.
  • Digital signature: The message is signed with the private key, and the signature only validates with the public key if the signer is who they claim to be.
  • A signature on its own does not guarantee confidentiality.
  • It does guarantee integrity.

RSA

One of the most widely used, it was created in 1977 by the MIT researchers Ron Rivest, Adi Shamir and Leonard Adleman.

It bases its robustness on the difficulty of factoring large numbers, which is really good, but it has its cons:

  • It uses very long keys when compared with the ones that the symmetric encryption ciphers use.
  • The algorithm is very demanding in computing terms.
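A toy sketch of the RSA arithmetic may help here. The primes are the usual tiny textbook values, picked only so the numbers fit on screen; real keys use primes hundreds of digits long plus padding, which this sketch leaves out. The function names are mine, and pow(e, -1, phi) needs Python 3.8 or later.

```python
# Key generation with tiny textbook primes (illustration only, trivially breakable).
p, q = 61, 53
n = p * q                  # public modulus: 3233
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent, coprime with phi
d = pow(e, -1, phi)        # private exponent: modular inverse of e


def encrypt(message: int) -> int:
    return pow(message, e, n)       # anyone can do this with the public key


def decrypt(ciphertext: int) -> int:
    return pow(ciphertext, d, n)    # only the private key holder can undo it


def sign(message: int) -> int:
    return pow(message, d, n)       # same operation, keys used in the other order


def verify(message: int, signature: int) -> bool:
    return pow(signature, e, n) == message


message = 65
print(decrypt(encrypt(message)) == message)  # True
print(verify(message, sign(message)))        # True
```

The sign/verify pair is the "keys are interchangeable" property in action: what the private key produces, the public key checks.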

The discrete logarithm problem

Given values for a, b and n such that n is a prime number, the function x = a^b mod n is very easy to compute.

But if you know the values of x, a and n, finding the value of b is very hard when those values are very large.

This is the basis of several public key schemes, such as the Diffie-Hellman key exchange.
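The asymmetry is easy to see in code. In this sketch (the values and the helper name are just for illustration) the forward direction is a single call to Python's built-in pow, while the reverse direction has to try exponents one by one, which only works because the prime is tiny.

```python
def discrete_log(x: int, a: int, n: int):
    """Brute force: find the smallest b with a**b % n == x, or None."""
    value = 1
    for b in range(n):
        if value == x:
            return b
        value = value * a % n
    return None


a, b, n = 5, 6, 23
x = pow(a, b, n)               # forward direction: fast even for huge numbers
print(x)                       # 8
print(discrete_log(x, a, n))   # 6, found by trying every exponent

# The forward direction stays instant with a 127-bit prime modulus...
big = pow(5, 2**200 + 1, 2**127 - 1)
# ...while the loop above would need on the order of 2**127 steps to invert it.
```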

Prime tests

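Checking whether a very large number is prime by brute force takes a lot of time, and finding a prime means testing many candidates (around a large number N, the average distance between prime numbers is about ln(N)), so there are several algorithms that help:

  • Miller-Rabin probabilistic test.
  • AKS algorithm (since 2002).
  • Extended Euclidean algorithm: not a primality test by itself, it computes greatest common divisors and modular inverses, which key generation also needs.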
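As an example of the first test, here is a compact Miller-Rabin sketch. The number of rounds (20) and the list of small primes used for a quick pre-check are my own choices; each round with a random base lets a composite number slip through with probability at most 1/4.

```python
import random


def is_probable_prime(n: int, rounds: int = 20) -> bool:
    """Miller-Rabin: False means composite for sure, True means almost surely prime."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):   # quick trial division by small primes
        if n % p == 0:
            return n == p
    # Write n - 1 as 2^r * d with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)   # random base
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue                     # this base does not expose n
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                 # a is a witness: n is composite
    return True


print(is_probable_prime(2**61 - 1))  # a known Mersenne prime
print(is_probable_prime(17 * 19))    # a small composite
```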

02 - Security services

Confidentiality

This is met with both types of encryption (symmetric and asymmetric).

Integrity

This guarantees that the received message has not been modified and is the exact same message that was sent.

To do this, a one-way function (also known as hash function) generates a code that represents the message that will be sent.

The hash is then stored and compared to the hash of the received message.

Message authentication

It’s the ability to guarantee the identity of the sender of the message.

On paper it’s done with a handwritten signature.

With symmetric encryption it can be done in several ways:

  • Checksum.
  • MAC (Message Authentication Code).
  • Hash + key: Hash + key diagram
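The "hash + key" idea is what Python's standard hmac module implements. This is a minimal sketch; the key and the messages are made-up values, and SHA-256 is my choice of hash.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """MAC of the message: a hash that also depends on the shared secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    # compare_digest avoids leaking, through timing, how many bytes matched.
    return hmac.compare_digest(make_tag(key, message), tag)


shared_key = b"a key both ends know"   # hypothetical shared secret
message = b"transfer 10 EUR"
tag = make_tag(shared_key, message)

print(verify_tag(shared_key, message, tag))              # True
print(verify_tag(shared_key, b"transfer 99 EUR", tag))   # False: message changed
print(verify_tag(b"some other key", message, tag))       # False: wrong key
```

Without the key an attacker cannot recompute a valid tag for a modified message, which a plain hash or checksum would allow.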

Non-repudiation

This means that a person cannot deny:

  • That this person sent the message (this is done with digital signature).
  • That this person received the message (e.g. this is the double tick that WhatsApp has).

Entity authentication

The authentication is done by using a piece of information (generally a key or a password) that the agent that wants to authenticate has.

Challenge-response authentication: The verifier sends a challenge, and the response must be the result of applying a function to it. It can be done using:

  • Public key schemes.
  • Digital signature schemes.

2FA for people: After the user has successfully entered their access credentials, the system asks for more information before letting the user in. This can belong to one of three basic categories:

  • Something that the user knows.
  • Something that the user has.
  • Something that the user is.

Hash functions

A hash is a function that assigns a fixed length value to data of any length.

A good hash function must meet the following requirements:

  • H can be applied to messages of any length.
  • H produces output of a fixed length.
  • H(x) is easy to compute.
  • For a given hash h, it is not feasible to find m such that H(m)=h.
  • For a given block x, it is not feasible to find y≠x such that H(x)=H(y).
  • It is computationally infeasible to find any pair x≠y such that H(x)=H(y).
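The first requirements are easy to observe with Python's hashlib. This sketch uses SHA-256 (my choice of function) to show that the output length is fixed whatever the input length, and that a one-letter change gives an unrelated digest, which is what the integrity check relies on.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Inputs of any length, output of a fixed length (64 hex digits = 256 bits).
for message in (b"", b"a", b"a" * 1_000_000):
    digest = sha256_hex(message)
    print(len(message), "bytes ->", len(digest) * 4, "bits:", digest[:16], "...")

# Integrity check: store the hash of what was sent, compare on reception.
sent = b"meet me at noon"
stored_hash = sha256_hex(sent)
print(sha256_hex(b"meet me at noon") == stored_hash)  # True: unmodified
print(sha256_hex(b"meet me at moon") == stored_hash)  # False: one letter changed
```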

Hash vs CRC

A hash is a one-way function, designed to make it difficult to find an input that produces a certain output value.

A CRC is designed to detect accidental changes in the data. Its purpose is not to protect against deliberate changes, only to detect accidental ones.
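One way to see the difference: a CRC is linear, so for messages of equal length the CRC of a XOR-combination can be predicted from the individual CRCs without ever looking at the result. A cryptographic hash has no such structure. The messages below are made up, and the sketch uses zlib.crc32 and SHA-256 from the standard library.

```python
import hashlib
import zlib


def xor3(a: bytes, b: bytes, c: bytes) -> bytes:
    return bytes(x ^ y ^ z for x, y, z in zip(a, b, c))


# Three made-up messages of the same length (16 bytes each).
a = b"pay 100 to alice"
b = b"pay 900 to alice"
c = b"pay 100 to bob!!"
assert len(a) == len(b) == len(c)

combined = xor3(a, b, c)

# The CRC of the combination is just the XOR of the three CRCs.
predicted_crc = zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)
print(zlib.crc32(combined) == predicted_crc)   # True: predictable structure

# The same trick does not work with a cryptographic hash.
digests = [hashlib.sha256(m).digest() for m in (a, b, c)]
predicted_hash = xor3(*digests)
print(hashlib.sha256(combined).digest() == predicted_hash)  # False
```

This predictability is fine for catching transmission noise, but it lets an attacker adjust a message and its CRC together.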

Some hash functions

MD5: Improvement over MD4 and MD2, slower but more secure than them. Produces 128-bit output.

SHA-1: Secure Hash Algorithm, published in 1995. Similar to MD5 but this one produces 160-bit output.

SHA-2: Set of functions (SHA-224, SHA-256, SHA-384, SHA-512), published in 2001.

SHA-3: Set of functions published in 2015. Standardized, but still far less widely used than SHA-2.

Digital signature

Properties:

  • Able to verify author, date and time.
  • Authenticates the content at the time of signing.
  • Verifiable by third parties.

Requirements:

  • Bit pattern that depends on the message being signed.
  • Uses information unique to the signer, to prevent forgery, impersonation and denial of the signature.
  • Easy to generate.
  • Easy to recognize and verify.
  • Computationally infeasible to forge (neither the signature nor the message).
  • Must be practical to store a copy.

A message can be signed by more than one person, and it can also be signed by a supervisor.

Key management

Symmetric key distribution

There are several possibilities:

  • Use of session keys.
  • KDC (Key Distribution Centre).
  • 3-way protocol.

Public key distribution

To distribute public keys, the possibilities are the following:

  • Public announcement: A secure channel is needed.
  • Publicly available directory: A big, reliable organization takes care of the maintenance.
  • Public key authority: Maintained by an authority. Users need a trusted copy of the public key of the authority. The authority sends keys to the users that request them.
  • Public key certificate: A certificate is a file digitally signed by a Certification Authority (CA) that links some data (such as a public key) to an identity. Both sender and receiver trust the CA. X.509 is the PKI standard for these certificates.

CA

A CA is a trusted organization responsible for issuing certificates for users or servers.

Local scope: Enterprise, campus or country: e.g. in Spain, FNMT and DNIe. Self-signed certificates.

Global scope: We trust a certificate if it is signed by an authority that we all trust. There are two types of certification authority networks: tree (PKI) and distributed (keyrings).

Secure communication protocols

SSL

Standing for Secure Sockets Layer, it was designed by Netscape for their web browser.

Works on top of the transport layer (TCP).

The services it offers are the following:

  • Data compression.
  • Security: (Parameter negotiation, client-server authentication, data integrity and confidentiality).

Stages:

  1. Handshake: The parameters of the algorithms and the key length are negotiated between both ends of the communication. The public keys are exchanged. The authentication is done via certificates.
  2. Transfer: Symmetric key determination and encrypted data exchange.

TLS

Based on SSL 3.0, but not compatible with it.

In contrast with SSL, TLS can reuse an already existing TCP connection, so it does not need dedicated ports to work. As long as the certificates are validated, it resists Man In The Middle attacks.
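The protection against an attacker in the middle comes from checking the certificate of the other end. As a small sketch, this is what the defaults of Python's standard ssl module look like on the client side; nothing is sent over the network here, the code only inspects the configuration.

```python
import ssl

# Default client-side configuration recommended by the standard library.
context = ssl.create_default_context()

# The server must present a certificate signed by a trusted CA...
print(context.verify_mode == ssl.CERT_REQUIRED)  # True
# ...and the name in that certificate must match the host we asked for.
print(context.check_hostname)                    # True
```

Turning either check off keeps the encryption but gives up the authentication, so whoever answers the connection could be anyone.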

IPsec

Collection of security protocols at network layer (IP).

Two modes of operation:

  • Transport mode: Protects the information sent by the transport layer, that is, only the payload of the IP datagram. This mode is useful in end-to-end communications.
  • Tunnel mode: Protects the original IP datagram, that is, everything. This mode is useful if one of the ends does not support IPsec, e.g. firewall, VPN.

IPsec working modes

It has three protocols:

  • AH (Authentication Header): It provides origin authentication and integrity, but not confidentiality.
  • ESP (Encapsulating Security Payload): It provides origin authentication, integrity, and confidentiality too.
  • IKE (Internet Key Exchange): It negotiates the Security Associations (SA). An SA is a one-way relationship, so for a two-way communication we use two SAs, which are established the first time that a datagram is exchanged. This converts a connectionless protocol into a connection-oriented one.

IPsec SA two way

03 - Operating systems security

Definitions

  • Threat: Any situation that endangers the security.
  • Vulnerability: Weakness that is liable to produce an error.
  • Exploit: Technique that allows the attacker to take advantage of a certain vulnerability to break the security of a system.
  • Social engineering: The art of manipulating people so they give up confidential information.
  • APT: Advanced Persistent Threat.
  • Botnets: Network of compromised, infected computers that can be used to perform distributed attacks.
  • Risk: Latent probability of a security incident taking place.

Vulnerabilities

Every system has vulnerabilities.

There are some strategies against them:

  • Security backups.
  • Risk analysis.
  • Suspicious events detection.
  • Constant review of the security of the organisation.

The vulnerabilities must be classified. For that we have:

  • CVE (Common Vulnerabilities and Exposures): A unique ID is assigned to every vulnerability that is found, so they can be classified and organised. For example CVE-2017-0144 (EternalBlue).
  • CVSS (Common Vulnerability Scoring System): A system designed to classify vulnerabilities based on their attributes and their possible effects.

Operating systems

Every OS provides tools and mechanisms to guarantee the security of the system.

User management

The users must have the lowest privilege they need to operate the system and they must belong to only the necessary groups.

Filesystem

Needs:

  • Confidentiality.
  • Availability.
  • Integrity.

Protections:

  • Encrypted filesystems. They require the password at boot time.
  • Secure file deletion. Tools like Scrub and Shred.

Types of alterations:

  • In the data.
  • In the programs. These ones are very dangerous.

Random alterations

Hardware alterations can be, for example:

  • Memory, disks, USBs… The damage from these failures can be limited by using RAID and doing regular backups.
  • Power supply. A UPS prevents this from happening.

Software alterations are caused by:

  • Bad program design.
  • Programs in an inconsistent state.
  • Users with wrong privileges.

Alterations prevention

To prevent alterations from happening, several things can be done:

  • Use digital signature to check the authenticity of a file.
  • Use CRC and hashes to verify the integrity of a file.
  • Journaling: Log almost everything that happens in the system.

Log files

Their goal is to monitor the system so that, in case something bad happens, we can look at the logs to figure out what the cause of the problem was.

The logs:

  • Store important events.
  • Can be local or remote.
  • Help to detect errors.
  • Are produced by programs like Snare, ObserveIt, LogAnalyzer…
  • In Linux they are stored in the /var/log directory:
    • syslog
    • messages
    • auth.log
    • utmp
    • wtmp
    • btmp
    • lastlog
    • debug
    • apache
    • daemon
    • kern.log
    • user.log

Access control

The goal is to authenticate that someone is who they say they are.

To do that, we check for something that:

  • They know.
  • They have.
  • They are.

The authentication system must satisfy several characteristics:

  • Very reliable.
  • Economical.
  • Stand strong against certain attacks.
  • Acceptable by the users.

Password authentication system:

  • Simple and cheap.
  • The responsibility lies with the user.
  • A hash of the password is stored.
  • If the passwords are not hashed with salt, the hashes can be susceptible to a Rainbow table attack.
  • To create strong passwords there are several systems out there, e.g. Diceware.
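A minimal sketch of salted password storage with the standard library follows. PBKDF2 with SHA-256, the 16-byte salt and the iteration count are my assumptions for the example, not something the list above prescribes.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, tune it for your hardware


def hash_password(password: str, salt=None):
    """Return (salt, digest). A fresh random salt is used unless one is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def check_password(password: str, salt: bytes, expected: bytes) -> bool:
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


# Two users with the same password end up with different stored hashes,
# so a precomputed (rainbow) table is useless against them.
salt_a, hash_a = hash_password("correct horse")
salt_b, hash_b = hash_password("correct horse")
print(hash_a != hash_b)                                  # True
print(check_password("correct horse", salt_a, hash_a))   # True
print(check_password("wrong guess", salt_a, hash_a))     # False
```

Only the salt and the digest are stored; the password itself never is.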

Card authentication system:

  • Can be chipcards or smartcards.

Biometric authentication system:

  • Iris, palm of the hand, fingerprint…

Secure programming

Secure code development, without vulnerabilities.

Vulnerabilities

In order to detect them, most of the time a pentest or a security audit is necessary.

Types of vulnerabilities:

  • Buffer overflow: Until 2004 they were the cause of half of the total discovered vulnerabilities.

    To get rid of them, there are several things that can be done:

    • Never trust the user inputs.
    • Disable code execution on the stack.
    • Use stackguard.
  • Race conditions: Caused by not protecting critical sections. To fix them, use the tools that the operating system gives you, e.g. semaphores.
  • Common programming errors: Improper file management, not checking the inputs correctly, XSS…
  • SQLi: One of the most common vulnerabilities that webpages have. Always validate the user input before using it in database queries.
  • Rootkits: Persistent threat that provides the attacker root privileges when wanted. Very hard to detect, as they work at kernel level, but there are several tools like chkrootkit and rkhunter.
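To make the SQLi item concrete, here is a sketch with an in-memory SQLite database. The table, the user and the function names are invented for the example. The first function builds the query by pasting the input into the SQL text; the second passes the input as parameters, so it is always treated as data.

```python
import sqlite3

# Hypothetical users table, kept in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")


def login_vulnerable(name: str, password: str) -> list:
    # BAD: user input becomes part of the SQL statement itself.
    query = (
        f"SELECT name FROM users WHERE name = '{name}' AND password = '{password}'"
    )
    return conn.execute(query).fetchall()


def login_safe(name: str, password: str) -> list:
    # GOOD: placeholders keep the input out of the SQL syntax.
    query = "SELECT name FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchall()


payload = "' OR '1'='1"
print(login_vulnerable("alice", payload))  # [('alice',)]: logged in without the password
print(login_safe("alice", payload))        # []: the payload is just a wrong password
```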

04 - Network security

The Internet comes with challenges that were never thought about, as it allows everything to be available from anywhere in the world 24/7.

Access control becomes harder. New security needs appear, such as securing services that were not designed to be secure, e.g. WEP, GSM.

Types of attacks

Active, which can be easily detected but are very hard to prevent:

  • Interruption.
  • Fabrication.
  • Modification.

Passive, which are very hard to detect, but can be prevented with encryption and other protections:

  • Interception.

Network security

Only enable the necessary services and take countermeasures against their vulnerabilities.

A firewall controls the packets that enter and exit the system.

The network services can be:

  • Independent: Like any other program
  • Managed by inetd: The inetd daemon starts the process when it is needed and shuts it down when it is not. The behaviour can be configured in the file /etc/inetd.conf

TCP wrappers

In order for this to work, the programs must be compiled with libwrap support.

In the file /etc/hosts.deny you specify the denied access, and in /etc/hosts.allow you specify the allowed access.

Sysctl

This allows us to communicate with the kernel at run time. By editing the file /etc/sysctl.conf we can do things like:

  • Ignore all ping requests: net.ipv4.icmp_echo_ignore_all=1
  • Ignore all ping broadcasts: net.ipv4.icmp_echo_ignore_broadcasts=1
  • Drop incoming packets whose source IP address fails reverse path validation (spoofed addresses): net.ipv4.conf.all.rp_filter=1 (Can be 0, 1 or 2).
  • Log the packets with an impossible source IP: net.ipv4.conf.all.log_martians=1

To apply the changes run sysctl -p

Service checking

To check the network services that are running, tools like netstat and nmap can be used.

Attacks

DoS

Loss of connectivity due to port, network or resource saturation. Up to 3 years in jail.

IP spoofing

Send packets with a fake source IP.

If you send ICMP ECHO REQUEST packets to a broadcast address with the victim IP as the source, the attack is called smurfing.

Nowadays routers do not forward broadcast datagrams outside their own subnets.

ARP spoofing, poisoning

Change the MAC address that the victims associate with an IP. This can be done in a switched LAN.

To discover the hosts that are in reach: arpscan -a

To poison two victims a tool named arpspoof can be used, in conjunction with echo 1 > /proc/sys/net/ipv4/ip_forward

Countermeasures:

  • Static ARP entries (fixed IP-MAC pairs) on the router.
  • Use arpwatch to monitor the network.

TCP SYN flood

Takes advantage of the 3-way handshake, as the server allocates resources when a SYN packet is received, and they are not released until 75 seconds have passed. It results in an out-of-memory (OOM) condition most of the time.

This can be mitigated:

  • SYN cookies: Only allocate resources when the final message is received.
  • SYN cache: Independent structure that can’t grow infinitely. By default in FreeBSD.

UDP flood

Send packets to random ports, and cause the victim to send ICMP destination unreachable messages back to the attacker (or fake the origin IP and put another target).

DNS spoofing

Return a fake IP after a DNS query is made and captured by the attacker.

Web spoofing

These days this is called phishing.

Mail spoofing

Saying in an email that the sender is another person.

MITM

A significant threat when Diffie-Hellman without authentication is being used.

Iptables

Iptables is a tool that the Linux system provides to manage netfilter, which is a very powerful packet manipulation framework provided by the kernel.

It is a stateful firewall. This means that it keeps track of connections: it can examine the first packet of a connection, make a decision, and treat the rest of the packets of that connection the same way.

The framework consists of three main tables:

  • Filter
  • Mangle
  • Nat

It also has another two tables:

  • Raw: For marking packets that should bypass connection tracking (as netfilter is a stateful firewall).
  • Security: Only used to set internal SELinux security context marks on packets.

And each table has several chains linked to it (not every table has all the chains):

  • PREROUTING
  • INPUT
  • FORWARD
  • OUTPUT
  • POSTROUTING

The following diagram represents the flow of the packets through the chains:

IPtables tables

There are several targets (actions) that can be applied to a packet:

  • ACCEPT
  • DROP
  • REJECT
  • LOG
  • SNAT
  • DNAT
  • MASQUERADE

Tools

Ettercap, now replaced by Bettercap.

Packit: A tool to craft network packets and to do tests with.

Hping3: Like ping, but better.

Sniffer detection

This is a very hard thing to do, but there are some tools that can help you to do so, like Sniffdet.

To make it more difficult to sniff network traffic, here are some things that can be done:

  • Network and hosts segmentation using switches (but you gotta be careful with ARP poisoning).
  • Encrypted communications.

SNORT

Snort is an IDS (Intrusion Detection System), specifically a NIDS.

It’s got filters, rules, abnormal events detector and a module for making reports and managing alarms.

05 - General concepts

General culture concepts and knowledge:

  • Mirai botnet: Behind some of the largest DDoS attacks in history (2016)
  • Wannacry: A classic
  • Stuxnet worm: Targeted attack on Iranian nuclear facilities
  • Security basis: Confidentiality, integrity and availability
  • Other security basis: Authenticity, accountability and non-repudiation
  • More security basis: Privacy, anonymity, untraceability, unlinkability, unobservability
  • Protection basis: Prevention, detection, reaction
  • Software vulnerabilities exist for a reason and cannot be completely eliminated, but they can be avoided

06 - x86 ISA

It is necessary to know how processors work to be able to understand how vulnerabilities can be exploited.

This section explains the x86 and x86_64 ISAs of Intel processors, as Intel is the most common brand of processor (66% of the market).

Program compilation process

The process of compiling a program written in C/C++ is as shown in the following image:

Compilation process in C/C++

If you compile a C program with the option -save-temps, gcc won’t delete .i and .s files:

┌─[javier@torre]─[~]
└──╼ $cat sample.c
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char * argv){
        fprintf(stdout, "[*] This is a sample program\n");
        return 0;
}
┌─[javier@torre]─[~]
└──╼ $gcc -save-temps sample.c -o sample
┌─[javier@torre]─[~]
└──╼ $ls -l
total 76
-rwxr-xr-x 1 javier javier 16656 Dec 12 23:05 sample
-rw-r--r-- 1 javier javier   148 Dec 12 23:05 sample.c
-rw-r--r-- 1 javier javier 42336 Dec 12 23:05 sample.i
-rw-r--r-- 1 javier javier  1632 Dec 12 23:05 sample.o
-rw-r--r-- 1 javier javier   593 Dec 12 23:05 sample.s
┌─[javier@torre]─[~]
└──╼ $./sample
[*] This is a sample program

Assembly code is written using mnemonics. To demonstrate it, the following is an example of assembly code for the Motorola 6809 processor (the first assembly language I learned):

; hello.asm: Simple program that prints the string on the screen

        .area PROG (ABS)

        .org 0x100          ; Start at 0x100
string: .ascii "Hello sir"
        .byte   10          ; 10 = CTRL+J = line feed (\n)
        .byte   0           ; 0  = CTRL+@ = NUL, marks the end of the string

        .globl program      ; Here starts the code
program:
        ldx #string
loop:   lda ,x+
        beq end
        sta 0xFF00          ; Print on the screen
        bra loop
end:    clra
        sta 0xFF01

        .org 0xFFFE         ; RESET vector
        .word program

Concepts

Machine code: Code that can be directly executed by the computer without further translation.

Bytecode: Code that can be executed by a virtual machine, for example, Java creates this, and then the Java virtual machine runs it.

Opcode: Number that represents an operation (OPeration CODE).

Mnemonics: Human-readable names of the instructions in assembly language. Each one stands for an opcode and takes zero or more arguments.

x86 registers

A register is a form of storage that the processor has. Registers are really fast to access and to operate on, and they are different from the main memory, as they are inside the processor.

There are registers for different kinds of things:

  • General purpose: Used for storing values and memory addresses.
  • Segment: Used for identifying segments in memory.
  • Program status and control (flag registers): They store the flags that indicate the result of an arithmetic operation (overflow, zero, …).
  • Instruction pointer: Also named PC as for Program Counter

General purpose registers on x86

  • EAX: Extended Accumulator register.
  • EBX: Extended Base register, a base pointer for memory access.
  • ECX: Extended Counter register, counter for loop/string operations.
  • EDX: Extended Data register, a pointer for I/O.
  • ESI: Extended Source Index pointer for string operations.
  • EDI: Extended Destination Index pointer for string operations.
  • EBP: Extended Base Pointer, a pointer to data on the stack.
  • ESP: Extended Stack Pointer.

The following image illustrates the mapping of the registers: x86 General purpose registers

Instruction pointer register on x86

The EIP (Extended Instruction Pointer) points to the next instruction to be executed. It cannot be written directly; it only changes through control flow instructions (call, jmp, ret).

Program status and control (flag registers)

The EFLAGS register has several flags:

  • CF: Carry Flag.
  • PF: Parity Flag.
  • ZF: Zero Flag.
  • TF: Trap Flag.
  • OF: Overflow Flag.
  • SF: Sign Flag.

Data types

There are several data types, each one being double the size of the previous one:

  • Byte: 8 bits.
  • Word: 16 bits.
  • Double word: 32 bits.
  • Quad word: 64 bits. (Combining EDX and EAX into one)

x86 endianness

The x86 uses the little endian format to store information in memory: the least significant byte goes first. This means that, for example, the double word 0x28A1427F will be stored as the bytes 7F 42 A1 28.
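You can check the byte order without touching assembly. This short sketch uses Python's struct module to lay out the same 32-bit value in both byte orders.

```python
import struct

value = 0x28A1427F

little = struct.pack("<I", value)  # "<" = little endian, "I" = 32-bit unsigned
big = struct.pack(">I", value)     # ">" = big endian, for comparison

print(little.hex())  # 7f42a128  (how x86 stores it in memory)
print(big.hex())     # 28a1427f

# Reading the bytes back with the right byte order recovers the value.
print(hex(int.from_bytes(little, "little")))  # 0x28a1427f
```

This matters when you write addresses into memory during exploitation: the bytes must be supplied in the stored order, not the written one.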

Data movement

To move data, the x86 has the MOV instruction:

  • Immediate to register: mov eax, 0x41. This will put the value 0x41 in the EAX register.
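  • Register to register: mov eax, ebx. This will put whatever is in the EBX register into the EAX register (in the Intel syntax the destination comes first).
  • Immediate to memory: mov dword [eax], 0x41. This will put 0x41 in the memory address that the EAX register contains. The size (dword here) has to be stated, as neither operand implies it.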
  • Register to memory: mov [eax], ebx. This will put whatever is in the EBX register in the memory address that the EAX register contains.
  • Memory to register: mov eax, [ebx]. This will put whatever is in the memory address that the EBX register contains into the EAX register.

LEA

This instruction will Load the Effective Address into a register: it computes the address without accessing memory. Example: lea eax, [ebx+0x04] puts the value of EBX plus 4 into EAX.

Arithmetic instructions

add eax, ebx

sub eax, ebx

inc edi

dec esi

Bit-level instructions

and eax, ebx

or eax, ebx

xor ebx, eax

not eax

Jump instructions

  • Unconditional:

    jmp eax

    jmp [ebx]

  • Conditional:

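    Preceded by cmp eax, ebx most of the time. They take the destination of the jump as their only argument (target is a hypothetical label here) and decide based on the flags that the comparison set.

    jle target

    jz target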

More info about instructions

  • https://en.wikibooks.org/wiki/X86_Assembly/Control_Flow

  • https://en.wikipedia.org/wiki/X86_instruction_listings

Memory segmentation and stack operations

In order to exploit security bugs, most of the time you’ll have to overwrite or overflow a portion of the memory into another one.

The following image illustrates the structure of a C program:

C program memory diagram

Stack

The stack is a LIFO structure, and it grows downwards (towards lower memory addresses).

It has two operations, push and pop.

The ESP points to the address of the most recently pushed data that is currently on the stack (the top of the stack).
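The behaviour of push and pop can be modelled in a few lines. This is only a simulation under my own assumptions (a dictionary as memory, 4-byte slots as on 32-bit x86, an arbitrary starting address), but it shows how ESP moves: it always holds the address of the most recently pushed value.

```python
class ToyStack:
    """Tiny model of the x86 stack: ESP decreases on push, increases on pop."""

    def __init__(self, top_address: int = 0x1000):
        self.memory = {}          # address -> value
        self.esp = top_address    # arbitrary starting address for the example

    def push(self, value: int) -> None:
        self.esp -= 4             # the stack grows towards lower addresses
        self.memory[self.esp] = value

    def pop(self) -> int:
        value = self.memory.pop(self.esp)
        self.esp += 4             # the stack shrinks back up
        return value


stack = ToyStack()
stack.push(0xAAAA)
stack.push(0xBBBB)
print(hex(stack.esp))    # 0xff8: two slots below the start
print(hex(stack.pop()))  # 0xbbbb: last in, first out
print(hex(stack.esp))    # 0xffc
```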

Function calls

To call a function, you use the call instruction.

Once you call a function, the return address (the address of the instruction that follows the call) is pushed to the stack.

The return of the function is performed via the ret instruction.

At the start of the called function there is what is called the function prologue. Its purpose is to back up the selected registers and reserve space for the local variables.

At the end of the function, in the epilogue, everything that was done in the prologue is undone in reverse order before issuing the return.

Function invocation

  • Windows (x86_64):

    First four arguments are loaded into RCX, RDX, R8, R9.

    The rest of them are passed through the stack from right to left.

  • Linux (x86_64):

    First six arguments are loaded into RDI, RSI, RDX, RCX, R8, R9.

    The rest of them are passed through the stack from right to left.

Exercises

  1. Prepare the system:

    You’ll need a Linux system with gcc, gdb and nasm installed on it.

    You can do so by running the following command:

    apt update && apt install gcc gdb nasm -y
    

    Now configure gdb to show disassemblies in the intel format (we don’t want the AT&T format) by running this command:

    echo 'set disassembly-flavor intel' >> ~/.gdbinit
    

    If you want to use objdump for disassemblies, use the option -M intel to get the output in the intel format.

    An example against the sample program that we compiled before:

    ┌─[javier@torre]─[~/sample]
    └──╼ $objdump -M intel sample -d | head -15
       
    sample:     file format elf64-x86-64
       
       
    Disassembly of section .init:
       
    0000000000001000 <_init>:
        1000:       48 83 ec 08             sub    rsp,0x8
        1004:       48 8b 05 dd 2f 00 00    mov    rax,QWORD PTR [rip+0x2fdd]        # 3fe8 <__gmon_start__>
        100b:       48 85 c0                test   rax,rax
        100e:       74 02                   je     1012 <_init+0x12>
        1010:       ff d0                   call   rax
        1012:       48 83 c4 08             add    rsp,0x8
        1016:       c3                      ret
    

03 - Runtime attacks

Coming soon

05 - Defenses against runtime attacks

Coming soon

04 - Code reuse: Attacks and defenses

Coming soon

06 - Web security

Coming soon

07 - Blockchain

Coming soon

08 - Smart contracts

Coming soon

09 - Side channel attacks

Coming soon

10 - Hardware security

Coming soon