New
parent
edbb6a1c58
commit
d034098994
189
blog/asm/1.html
|
@ -3,7 +3,7 @@
|
|||
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
|
||||
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
|
||||
<head>
|
||||
<!-- 2024-03-22 Fri 14:08 -->
|
||||
<!-- 2024-03-23 Sat 15:57 -->
|
||||
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1" />
|
||||
<title>x86 Assembly from my understanding</title>
|
||||
|
@ -23,9 +23,9 @@
|
|||
<p>
|
||||
Soooo this article (or maybe even a series of articles, who knows?) will be about x86 assembly, or rather, what I understood from it and my road from the bottom up, hopefully reaching a good level of understanding.
|
||||
</p>
|
||||
<div id="outline-container-orgb0eec26" class="outline-2">
|
||||
<h2 id="orgb0eec26">Memory :</h2>
|
||||
<div class="outline-text-2" id="text-orgb0eec26">
|
||||
<div id="outline-container-org9e6b7e3" class="outline-2">
|
||||
<h2 id="org9e6b7e3">Memory :</h2>
|
||||
<div class="outline-text-2" id="text-org9e6b7e3">
|
||||
<p>
|
||||
Memory is a sequence of octets (aka 8 bits), each of which has a unique integer assigned to it called <b>The Effective Address (EA)</b>. In this particular CPU architecture (the i8086), an octet is designated by a pair: a segment number and an offset within that segment.
|
||||
</p>
|
||||
|
@ -40,9 +40,9 @@ Memory is a sequence of octets (Aka 8bits) that each have a unique integer assig
|
|||
The offset and the segment are each encoded in 16 bits, so each takes a value between 0 and 65535.
|
||||
</p>
|
||||
</div>
|
||||
<div id="outline-container-org57cd217" class="outline-4">
|
||||
<h4 id="org57cd217">Important :</h4>
|
||||
<div class="outline-text-4" id="text-org57cd217">
|
||||
<div id="outline-container-orgb6ce3ec" class="outline-4">
|
||||
<h4 id="orgb6ce3ec">Important :</h4>
|
||||
<div class="outline-text-4" id="text-orgb6ce3ec">
|
||||
<p>
|
||||
The relation between the Effective Address and the Segment & Offset is as follows :
|
||||
</p>
|
||||
|
@ -52,8 +52,8 @@ The relation between the Effective Address and the Segment & Offset is as fo
|
|||
</p>
|
||||
</div>
|
||||
<ul class="org-ul">
|
||||
<li><a id="orgcbdf7c0"></a>Example :<br />
|
||||
<div class="outline-text-5" id="text-orgcbdf7c0">
|
||||
<li><a id="org24f01a3"></a>Example :<br />
|
||||
<div class="outline-text-5" id="text-org24f01a3">
|
||||
<p>
|
||||
Take the physical address (or Effective Address; this article uses the two terms interchangeably, although Intel's manuals reserve "effective address" for the offset part alone) <b>12345h</b> (the h stands for hexadecimal, which can also be written <b>0x12345</b>), with the register <b>DS = 1230h</b> and the register <b>SI = 0045h</b>. The CPU calculates the physical address by multiplying the content of the segment register <b>DS</b> by 10h (16 in decimal) and adding the content of the register <b>SI</b>, so we get : <b>1230h x 10h + 45h = 12345h</b>
|
||||
</p>
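<p>
The calculation above can be sketched in a few lines of Python. This is only an illustration: the helper name <b>physical_address</b> and the constant <b>ADDRESS_SPACE</b> are made up for this sketch, and the wrap-around to 20 bits reflects the 8086's 20-bit address bus.
</p>

```python
# Hypothetical helper (not from the article): 8086 real-mode address arithmetic.
ADDRESS_SPACE = 0x100000  # 2**20 bytes, because the 8086 has a 20-bit address bus


def physical_address(segment, offset):
    """Return segment * 10h + offset, wrapped to 20 bits."""
    return (segment * 0x10 + offset) % ADDRESS_SPACE


# The example from the text: DS = 1230h, SI = 0045h.
print(hex(physical_address(0x1230, 0x0045)))  # 0x12345
```

<p>
The same helper shows that a segment of 0100h with an offset of 0234h lands on the physical address 1234h.
</p>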
|
||||
|
@ -66,16 +66,16 @@ Now if you are a clever one ( I know you are, since you are reading this <3 )
|
|||
</li>
|
||||
</ul>
|
||||
</div>
|
||||
<div id="outline-container-org758f05f" class="outline-3">
|
||||
<h3 id="org758f05f">Registers</h3>
|
||||
<div class="outline-text-3" id="text-org758f05f">
|
||||
<div id="outline-container-org9256db9" class="outline-3">
|
||||
<h3 id="org9256db9">Registers</h3>
|
||||
<div class="outline-text-3" id="text-org9256db9">
|
||||
<p>
|
||||
The 8086 CPU has 14 registers, each 16 bits wide. From the POV of the user, the 8086 has 3 groups of 4 registers of 16 bits each, plus one status (flags) register with 9 meaningful bits and a 16-bit program counter (the instruction pointer, IP) that the user cannot access directly: it only changes through jumps, calls, and returns.
|
||||
</p>
|
||||
</div>
|
||||
<div id="outline-container-orgbb2bc8f" class="outline-4">
|
||||
<h4 id="orgbb2bc8f">General Registers</h4>
|
||||
<div class="outline-text-4" id="text-orgbb2bc8f">
|
||||
<div id="outline-container-orgfc1a89c" class="outline-4">
|
||||
<h4 id="orgfc1a89c">General Registers</h4>
|
||||
<div class="outline-text-4" id="text-orgfc1a89c">
|
||||
<p>
|
||||
General registers are used for arithmetic, logic, and addressing too.
|
||||
</p>
|
||||
|
@ -126,28 +126,28 @@ Now here are the Registers we can find in this section:
|
|||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div id="outline-container-org6de3be7" class="outline-3">
|
||||
<h3 id="org6de3be7">Addressing and registers…again</h3>
|
||||
<div class="outline-text-3" id="text-org6de3be7">
|
||||
<div id="outline-container-orge6b17b9" class="outline-3">
|
||||
<h3 id="orge6b17b9">Addressing and registers…again</h3>
|
||||
<div class="outline-text-3" id="text-orge6b17b9">
|
||||
</div>
|
||||
<div id="outline-container-orgfe32dc7" class="outline-4">
|
||||
<h4 id="orgfe32dc7">I realized what I wrote here before was almost gibberish, sooo here we go again I guess ?</h4>
|
||||
<div class="outline-text-4" id="text-orgfe32dc7">
|
||||
<div id="outline-container-orgd76cd4e" class="outline-4">
|
||||
<h4 id="orgd76cd4e">I realized what I wrote here before was almost gibberish, sooo here we go again I guess ?</h4>
|
||||
<div class="outline-text-4" id="text-orgd76cd4e">
|
||||
<p>
|
||||
Well, let’s take a step back to the notion of effective addresses vs. relative ones.
|
||||
</p>
|
||||
</div>
|
||||
</div>
|
||||
<div id="outline-container-org471cf7b" class="outline-4">
|
||||
<h4 id="org471cf7b">Effective = 10h x Segment + Offset . Part1</h4>
|
||||
<div class="outline-text-4" id="text-org471cf7b">
|
||||
<div id="outline-container-org86b6da3" class="outline-4">
|
||||
<h4 id="org86b6da3">Effective = 10h x Segment + Offset . Part1</h4>
|
||||
<div class="outline-text-4" id="text-org86b6da3">
|
||||
<p>
|
||||
When trying to access a specific memory location, we use the notation <b>[Segment:Offset]</b>. For example, assume <b>DS = 0100h</b> and that we want to write the value <b>0x0005</b> to the memory location at the physical address <b>1234h</b>. What do we do ?
|
||||
</p>
|
||||
</div>
|
||||
<ul class="org-ul">
|
||||
<li><a id="orgf021c83"></a>Answer :<br />
|
||||
<div class="outline-text-5" id="text-orgf021c83">
|
||||
<li><a id="org63a0b4e"></a>Answer :<br />
|
||||
<div class="outline-text-5" id="text-org63a0b4e">
|
||||
<div class="org-src-container">
|
||||
<pre class="src src-asm"><span style="color: #89b4fa;">MOV</span> [DS:0234h], 0x0005
|
||||
</pre>
|
||||
|
@ -159,7 +159,7 @@ Why ? Let’s break it down :
|
|||
|
||||
|
||||
|
||||
<div id="orgf6af9f9" class="figure">
|
||||
<div id="org9e1af4e" class="figure">
|
||||
<p><img src="../../src/gifs/lain-dance.gif" alt="lain-dance.gif" />
|
||||
</p>
|
||||
</div>
|
||||
|
@ -177,9 +177,9 @@ Simple, right ?, now for another example
|
|||
</li>
|
||||
</ul>
|
||||
</div>
|
||||
<div id="outline-container-org87609ad" class="outline-4">
|
||||
<h4 id="org87609ad">Another example :</h4>
|
||||
<div class="outline-text-4" id="text-org87609ad">
|
||||
<div id="outline-container-org704a2f5" class="outline-4">
|
||||
<h4 id="org704a2f5">Another example :</h4>
|
||||
<div class="outline-text-4" id="text-org704a2f5">
|
||||
<p>
|
||||
What if we now have this instruction ?
|
||||
</p>
|
||||
|
@ -192,9 +192,9 @@ What does it do ? You might or might not be surprised that it does the exact sam
|
|||
</p>
|
||||
</div>
|
||||
</div>
|
||||
<div id="outline-container-org6254e46" class="outline-4">
|
||||
<h4 id="org6254e46">Segment + Register <3</h4>
|
||||
<div class="outline-text-4" id="text-org6254e46">
|
||||
<div id="outline-container-org5f7abb9" class="outline-4">
|
||||
<h4 id="org5f7abb9">Segment + Register <3</h4>
|
||||
<div class="outline-text-4" id="text-org5f7abb9">
|
||||
<p>
|
||||
Consider <b>DS = 0100h</b> and <b>BX = BP = 0234h</b> and this code snippet:
|
||||
</p>
|
||||
|
@ -230,8 +230,8 @@ The General rule of thumb is as follows :
|
|||
</ul>
|
||||
</div>
|
||||
<ul class="org-ul">
|
||||
<li><a id="orgb5208d9"></a>Note<br />
|
||||
<div class="outline-text-5" id="text-orgb5208d9">
|
||||
<li><a id="orgea3e106"></a>Note<br />
|
||||
<div class="outline-text-5" id="text-orgea3e106">
|
||||
<p>
|
||||
The values of the registers CS, DS and SS are automatically initialized by the OS when the program is launched, so these segments are implicit. In other words, if we want to access a specific piece of data in memory, we just need to specify its offset. Also, you can’t write a value directly into the DS or CS segment registers, so something like
|
||||
</p>
|
||||
|
@ -246,9 +246,9 @@ The values of the registers CS DS and SS are automatically initialized by the OS
|
|||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div id="outline-container-org660d1f4" class="outline-2">
|
||||
<h2 id="org660d1f4">The ACTUAL thing :</h2>
|
||||
<div class="outline-text-2" id="text-org660d1f4">
|
||||
<div id="outline-container-orgb12dbb3" class="outline-2">
|
||||
<h2 id="orgb12dbb3">The ACTUAL thing :</h2>
|
||||
<div class="outline-text-2" id="text-orgb12dbb3">
|
||||
<p>
|
||||
Enough technical rambling, and now we shall go to the fun part, the ACTUAL CODE. But first, some names you should be familiar with :
|
||||
</p>
|
||||
|
@ -258,9 +258,9 @@ Enough technical rambling, and now we shall go to the fun part, the ACTUAL CODE.
|
|||
<li><b>Operands</b> : These are the arguments passed to the instructions, like <b>MOV dst, src</b>, and they can be anything from a memory location, to a variable, to an immediate value.</li>
|
||||
</ul>
|
||||
</div>
|
||||
<div id="outline-container-orgf7c0650" class="outline-3">
|
||||
<h3 id="orgf7c0650">Structure of an assembly program :</h3>
|
||||
<div class="outline-text-3" id="text-orgf7c0650">
|
||||
<div id="outline-container-org216dea5" class="outline-3">
|
||||
<h3 id="org216dea5">Structure of an assembly program :</h3>
|
||||
<div class="outline-text-3" id="text-org216dea5">
|
||||
<p>
|
||||
While there is no “standard” structure, I prefer to go with this one :
|
||||
</p>
|
||||
|
@ -276,9 +276,9 @@ While there is no “standard” structure, i prefer to go with this one
|
|||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div id="outline-container-orgc1466ec" class="outline-3">
|
||||
<h3 id="orgc1466ec">MOV dst, src</h3>
|
||||
<div class="outline-text-3" id="text-orgc1466ec">
|
||||
<div id="outline-container-orgdf5852b" class="outline-3">
|
||||
<h3 id="orgdf5852b">MOV dst, src</h3>
|
||||
<div class="outline-text-3" id="text-orgdf5852b">
|
||||
<p>
|
||||
The MOV instruction copies the second operand (src) to the first operand (dst). The source can be a memory location, an immediate value, or a general-purpose register (AX, BX, CX, DX). The destination can be a general-purpose register or a memory location.
|
||||
</p>
|
||||
|
@ -327,13 +327,13 @@ for segment registers only these types of MOV are supported:
|
|||
<b>memory</b>: [BX], [BX+SI+7], variable
|
||||
</p>
|
||||
</div>
|
||||
<div id="outline-container-org508d45b" class="outline-4">
|
||||
<h4 id="org508d45b">Note : The MOV instruction <b>cannot</b> be used to set the value of the CS and IP registers</h4>
|
||||
<div id="outline-container-orgef8aa84" class="outline-4">
|
||||
<h4 id="orgef8aa84">Note : The MOV instruction <b>cannot</b> be used to set the value of the CS and IP registers</h4>
|
||||
</div>
|
||||
</div>
|
||||
<div id="outline-container-orgb475e10" class="outline-3">
|
||||
<h3 id="orgb475e10">Variables :</h3>
|
||||
<div class="outline-text-3" id="text-orgb475e10">
|
||||
<div id="outline-container-org3486b9c" class="outline-3">
|
||||
<h3 id="org3486b9c">Variables :</h3>
|
||||
<div class="outline-text-3" id="text-org3486b9c">
|
||||
<p>
|
||||
Let’s say you want to use a specific value multiple times in your code, do you prefer to call it using something like <b>var1</b> or <b>E4F9:0011</b> ? If your answer is the second option, you can gladly skip this section, or even better, seek therapy.
|
||||
</p>
|
||||
|
@ -353,9 +353,9 @@ Anyways, we have two types of variables, <b>bytes</b> and <b>words(which are two
|
|||
<b>value</b> - can be any numeric value in any supported numbering system (hexadecimal, binary, or decimal), or “?” symbol for variables that are not initialized.
|
||||
</p>
|
||||
</div>
|
||||
<div id="outline-container-org3d5d0c5" class="outline-4">
|
||||
<h4 id="org3d5d0c5">Example code :</h4>
|
||||
<div class="outline-text-4" id="text-org3d5d0c5">
|
||||
<div id="outline-container-org89e3a28" class="outline-4">
|
||||
<h4 id="org89e3a28">Example code :</h4>
|
||||
<div class="outline-text-4" id="text-org89e3a28">
|
||||
<div class="org-src-container">
|
||||
<pre class="src src-asm"> <span style="color: #cba6f7;">org</span> 100h
|
||||
<span style="color: #cba6f7;">.data</span>
|
||||
|
@ -369,9 +369,9 @@ Anyways, we have two types of variables, <b>bytes</b> and <b>words(which are two
|
|||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div id="outline-container-orgd4e5244" class="outline-4">
|
||||
<h4 id="orgd4e5244">Arrays :</h4>
|
||||
<div class="outline-text-4" id="text-orgd4e5244">
|
||||
<div id="outline-container-org6f2214f" class="outline-4">
|
||||
<h4 id="org6f2214f">Arrays :</h4>
|
||||
<div class="outline-text-4" id="text-org6f2214f">
|
||||
<p>
|
||||
We can also define arrays instead of single values by using comma-separated values, like this for example :
|
||||
</p>
|
||||
|
@ -432,12 +432,89 @@ Of course, you can use DW instead of DB if it’s required to keep values la
|
|||
</p>
|
||||
</div>
|
||||
</div>
|
||||
<div id="outline-container-org4f06cd3" class="outline-4">
|
||||
<h4 id="org4f06cd3">LEA</h4>
|
||||
<div class="outline-text-4" id="text-org4f06cd3">
|
||||
<p>
|
||||
LEA (Load Effective Address) is an instruction used to get the offset of a specific variable. We will see later how it’s used, but first, here is something we will need :
|
||||
</p>
|
||||
|
||||
<p>
|
||||
In order to tell the assembler about the data type, these prefixes should be used:
|
||||
</p>
|
||||
|
||||
<p>
|
||||
<b>BYTE PTR</b> - for byte.
|
||||
<b>WORD PTR</b> - for word (two bytes).
|
||||
</p>
|
||||
|
||||
<p>
|
||||
For example:
|
||||
<b>BYTE PTR [BX]</b> ; byte access.
|
||||
or
|
||||
<b>WORD PTR [BX]</b> ; word access.
|
||||
The assembler supports shorter prefixes as well:
|
||||
</p>
|
||||
|
||||
<ol class="org-ol">
|
||||
<li><b>b.</b> - for BYTE PTR</li>
|
||||
<li><b>w.</b> - for WORD PTR</li>
|
||||
</ol>
|
||||
|
||||
<p>
|
||||
In certain cases the assembler can work out the data type automatically.
|
||||
</p>
|
||||
</div>
|
||||
<ul class="org-ul">
|
||||
<li><a id="org008d51a"></a>Example :<br />
|
||||
<div class="outline-text-5" id="text-org008d51a">
|
||||
<div class="org-src-container">
|
||||
<pre class="src src-asm"> <span style="color: #cba6f7;">org</span> 100h
|
||||
<span style="color: #cba6f7;">.data</span>
|
||||
<span style="color: #cba6f7;">VAR1</span> db 50h
|
||||
<span style="color: #cba6f7;">VAR2</span> dw 1234h
|
||||
<span style="color: #cba6f7;">.code</span>
|
||||
<span style="color: #cba6f7;">MOV</span> AL, VAR1 <span style="color: #6c7086;">; </span><span style="color: #6c7086;">We check the value of VAR1 by putting it in AL</span>
|
||||
<span style="color: #cba6f7;">MOV</span> AX, VAR2 <span style="color: #6c7086;">; </span><span style="color: #6c7086;">Same here</span>
|
||||
<span style="color: #cba6f7;">LEA</span> BX, VAR1 <span style="color: #6c7086;">; </span><span style="color: #6c7086;">BX receives the Address of VAR1</span>
|
||||
<span style="color: #cba6f7;">MOV</span> b.[BX], 44h
|
||||
<span style="color: #cba6f7;">MOV</span> AL, VAR1 <span style="color: #6c7086;">; </span><span style="color: #6c7086;">We effectively changed the content of the VAR1 variable</span>
|
||||
<span style="color: #cba6f7;">LEA</span> BX, VAR2
|
||||
<span style="color: #cba6f7;">MOV</span> w.[BX], 5678h
|
||||
<span style="color: #cba6f7;">MOV</span> AX, VAR2
|
||||
</pre>
|
||||
</div>
|
||||
</div>
|
||||
</li>
|
||||
</ul>
|
||||
</div>
|
||||
<div id="outline-container-org788d7c2" class="outline-4">
|
||||
<h4 id="org788d7c2">Constants :</h4>
|
||||
<div class="outline-text-4" id="text-org788d7c2">
|
||||
<p>
|
||||
Constants in Assembly only exist until the code is assembled, meaning that if you disassemble your code later, you won’t see your constant definitions.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Defining constants is pretty straightforward :
|
||||
</p>
|
||||
<div class="org-src-container">
|
||||
<pre class="src src-asm"> <span style="color: #cba6f7;">name</span> EQU value
|
||||
</pre>
|
||||
</div>
|
||||
|
||||
<p>
|
||||
Of course, constants can’t be changed and aren’t stored in memory, so they are like little macros that live in your code.
|
||||
</p>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div id="postamble" class="status">
|
||||
<p class="author">Author: Crystal</p>
|
||||
<p class="date">Created: 2024-03-22 Fri 14:08</p>
|
||||
<p class="date">Created: 2024-03-23 Sat 15:57</p>
|
||||
</div>
|
||||
</body>
|
||||
</html>
|
||||
|
|
|
@ -241,3 +241,48 @@ d DB 5 DUP(1, 2)
|
|||
d DB 1, 2, 1, 2, 1, 2, 1, 2, 1, 2
|
||||
#+END_SRC
|
||||
Of course, you can use DW instead of DB if it's required to keep values larger than 255, or smaller than -128. DW cannot be used to declare strings.
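To make the DB/DW size limits concrete, here is a small Python sketch. The function name =directive_for= is made up for this illustration and is not part of any assembler; it only encodes the ranges that fit in one byte (-128 to 255, covering both the signed and the unsigned reading) and in one word (-32768 to 65535).

```python
# Hypothetical helper: pick the smallest data directive that can hold a value.
def directive_for(value):
    if -128 <= value <= 255:        # fits in one byte (signed or unsigned)
        return "DB"
    if -32768 <= value <= 65535:    # fits in one 16-bit word
        return "DW"
    raise ValueError("value does not fit in 16 bits")


print(directive_for(255))   # DB
print(directive_for(256))   # DW
print(directive_for(-129))  # DW
```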
|
||||
*** LEA
|
||||
LEA (Load Effective Address) is an instruction used to get the offset of a specific variable. We will see later how it's used, but first, here is something we will need :
|
||||
|
||||
In order to tell the assembler about the data type, these prefixes should be used:
|
||||
|
||||
*BYTE PTR* - for byte.
|
||||
*WORD PTR* - for word (two bytes).
|
||||
|
||||
For example:
|
||||
*BYTE PTR [BX]* ; byte access.
|
||||
or
|
||||
*WORD PTR [BX]* ; word access.
|
||||
The assembler supports shorter prefixes as well:
|
||||
|
||||
b. - for BYTE PTR
|
||||
w. - for WORD PTR
|
||||
|
||||
In certain cases the assembler can work out the data type automatically.
|
||||
|
||||
**** Example :
|
||||
#+BEGIN_SRC asm
|
||||
org 100h
|
||||
.data
|
||||
VAR1 db 50h
|
||||
VAR2 dw 1234h
|
||||
.code
|
||||
MOV AL, VAR1 ; We check the value of VAR1 by putting it in AL
|
||||
MOV AX, VAR2 ; Same here
|
||||
LEA BX, VAR1 ; BX receives the Address of VAR1
|
||||
MOV b.[BX], 44h
|
||||
MOV AL, VAR1 ; We effectively changed the content of the VAR1 variable
|
||||
LEA BX, VAR2
|
||||
MOV w.[BX], 5678h
|
||||
MOV AX, VAR2
|
||||
#+END_SRC
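What the example above does to memory can be mimicked with a Python sketch. The offsets =VAR1_OFFSET= and =VAR2_OFFSET= are assumptions made for this illustration (the real offsets depend on where the assembler places the data); the point is only that =b.[BX]= touches one byte while =w.[BX]= touches two, stored low byte first.

```python
import struct

# Assumed layout (hypothetical offsets): VAR1 is one byte, VAR2 follows it.
VAR1_OFFSET = 0x0000
VAR2_OFFSET = 0x0001

memory = bytearray(16)
memory[VAR1_OFFSET] = 0x50                              # VAR1 db 50h
struct.pack_into("<H", memory, VAR2_OFFSET, 0x1234)     # VAR2 dw 1234h

bx = VAR1_OFFSET                                        # LEA BX, VAR1
memory[bx] = 0x44                                       # MOV b.[BX], 44h (one byte)
al = memory[VAR1_OFFSET]                                # MOV AL, VAR1

bx = VAR2_OFFSET                                        # LEA BX, VAR2
struct.pack_into("<H", memory, bx, 0x5678)              # MOV w.[BX], 5678h (two bytes)
ax = struct.unpack_from("<H", memory, VAR2_OFFSET)[0]   # MOV AX, VAR2

print(hex(al), hex(ax))   # 0x44 0x5678
print(memory[1:3].hex())  # 7856 (low byte stored first)
```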
|
||||
*** Constants :
|
||||
Constants in Assembly only exist until the code is assembled, meaning that if you disassemble your code later, you won't see your constant definitions.
|
||||
|
||||
Defining constants is pretty straightforward :
|
||||
#+BEGIN_SRC asm
|
||||
name EQU value
|
||||
#+END_SRC
|
||||
|
||||
Of course, constants can't be changed and aren't stored in memory, so they are like little macros that live in your code.
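The "little macros" idea can be pictured with a toy Python model. This is not how a real assembler is implemented; =expand_constants= is a made-up name, and it only does plain text substitution to show why no trace of the constant is left in the assembled output.

```python
import re


# Toy model (an assumption, not a real assembler): EQU as text substitution.
def expand_constants(source):
    constants = {}
    output = []
    for line in source.splitlines():
        match = re.fullmatch(r"\s*(\w+)\s+EQU\s+(\S+)\s*", line)
        if match:
            # Remember the definition and emit nothing for this line.
            constants[match.group(1)] = match.group(2)
            continue
        for name, value in constants.items():
            line = re.sub(rf"\b{re.escape(name)}\b", value, line)
        output.append(line)
    return "\n".join(output)


program = "k EQU 5\nMOV AX, k"
print(expand_constants(program))  # MOV AX, 5
```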
|
||||
|
|