Add stuff

This commit is contained in:
Crystal 2024-03-07 20:49:32 +01:00
parent 5beeeadbdc
commit 2461686f0e
3 changed files with 152 additions and 103 deletions


@ -3,7 +3,7 @@
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<!-- 2024-02-24 Sat 18:22 -->
<!-- 2024-03-07 Thu 20:48 -->
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>x86 Assembly from my understanding</title>
@ -23,9 +23,9 @@
<p>
Soooo this article (or maybe even a series of articles, who knows ?) will be about x86 assembly, or rather, what I understood from it and my road from the bottom-up hopefully reaching a good level of understanding
</p>
<div id="outline-container-org0804bec" class="outline-2">
<h2 id="org0804bec">Memory :</h2>
<div class="outline-text-2" id="text-org0804bec">
<div id="outline-container-orgf540874" class="outline-2">
<h2 id="orgf540874">Memory :</h2>
<div class="outline-text-2" id="text-orgf540874">
<p>
Memory is a sequence of octets (i.e. groups of 8 bits), each with a unique integer assigned to it called <b>the Effective Address (EA)</b>. On this particular CPU architecture (the i8086), an octet is designated by a pair: a segment number and an offset within that segment.
</p>
@ -40,9 +40,9 @@ Memory is a sequence of octets (Aka 8bits) that each have a unique integer assig
The offset and the segment are each encoded on 16 bits, so each takes a value between 0 and 65535.
</p>
</div>
<div id="outline-container-org91745ea" class="outline-4">
<h4 id="org91745ea">Important :</h4>
<div class="outline-text-4" id="text-org91745ea">
<div id="outline-container-orgdfb155c" class="outline-4">
<h4 id="orgdfb155c">Important :</h4>
<div class="outline-text-4" id="text-orgdfb155c">
<p>
The relation between the Effective Address and the Segment &amp; Offset is as follows:
</p>
@ -52,8 +52,8 @@ The relation between the Effective Address and the Segment &amp; Offset is as fo
</p>
</div>
<ul class="org-ul">
<li><a id="orge330b02"></a>Example :<br />
<div class="outline-text-5" id="text-orge330b02">
<li><a id="org1aab4ca"></a>Example :<br />
<div class="outline-text-5" id="text-org1aab4ca">
<p>
Take the physical address <b>12345h</b> (this article also calls it the Effective Address, although Intel&rsquo;s manuals reserve that term for the offset part alone). The h suffix means hexadecimal, so the same value can also be written <b>0x12345</b>. With the register <b>DS = 1230h</b> and the register <b>SI = 0045h</b>, the CPU calculates the physical address by multiplying the content of the segment register <b>DS</b> by 10h (16) and adding the content of the register <b>SI</b>, so we get: <b>1230h x 10h + 45h = 12345h</b>
</p>
@ -66,16 +66,16 @@ Now if you are a clever one ( I know you are, since you are reading this &lt;3 )
</li>
</ul>
</div>
<div id="outline-container-org90039d0" class="outline-3">
<h3 id="org90039d0">Registers</h3>
<div class="outline-text-3" id="text-org90039d0">
<div id="outline-container-org86f9b8f" class="outline-3">
<h3 id="org86f9b8f">Registers</h3>
<div class="outline-text-3" id="text-org86f9b8f">
<p>
The 8086 CPU has 14 registers, each 16 bits wide. From the user&rsquo;s point of view, there are 3 groups of 4 registers of 16 bits, plus a status (flags) register in which 9 bits are used, and a 16-bit program counter (the instruction pointer, IP) that the user cannot read or write directly with MOV.
</p>
</div>
<div id="outline-container-org758d630" class="outline-4">
<h4 id="org758d630">General Registers</h4>
<div class="outline-text-4" id="text-org758d630">
<div id="outline-container-org9fb78cf" class="outline-4">
<h4 id="org9fb78cf">General Registers</h4>
<div class="outline-text-4" id="text-org9fb78cf">
<p>
General registers take part in arithmetic, logic, and addressing.
</p>
@ -125,97 +125,126 @@ Now here are the Registers we can find in this section:
</div>
</div>
</div>
<div id="outline-container-org810d22b" class="outline-4">
<h4 id="org810d22b">Offset/Address Registers</h4>
<div class="outline-text-4" id="text-org810d22b">
</div>
<div id="outline-container-orgbdc7488" class="outline-3">
<h3 id="orgbdc7488">Addressing and registers&#x2026;again</h3>
<div class="outline-text-3" id="text-orgbdc7488">
</div>
<div id="outline-container-org81d6e8a" class="outline-4">
<h4 id="org81d6e8a">I realized what I wrote here before was almost gibberish, sooo here we go again I guess ?</h4>
<div class="outline-text-4" id="text-org81d6e8a">
<p>
<b>SP</b>: This is the stack pointer. It is of 16 bits. It points to the topmost item of the stack. If the stack is empty the stack pointer will be (FFFE)H (or 65534 in decimal). Its offset address is relative to the stack segment(SS).
</p>
<p>
<b>BP</b>: This is the base pointer. It is of 16 bits. It is primarily used in accessing parameters passed by the stack. Its offset address is relative to the stack segment(SS).
</p>
<p>
<b>SI</b>: This is the source index register. It is of 16 bits. It is used in the pointer addressing of data and as a source in some string-related operations. Its offset is relative to the data segment(DS).
</p>
<p>
<b>DI</b>: This is the destination index register. It is of 16 bits. It is used in the pointer addressing of data and as a destination in some string-related operations. Its offset is relative to the extra segment(ES).
Well, let&rsquo;s take a step back to the notion of effective addresses versus relative ones.
</p>
</div>
</div>
<div id="outline-container-orgfd6556c" class="outline-4">
<h4 id="orgfd6556c">Segment Registers</h4>
<div class="outline-text-4" id="text-orgfd6556c">
<div id="outline-container-org7dee427" class="outline-4">
<h4 id="org7dee427">Effective = 10h x Segment + Offset . Part1</h4>
<div class="outline-text-4" id="text-org7dee427">
<p>
<b>CS</b>: Code Segment, it defines the start of the program memory, and the different addresses of the different instructions relative to CS.
</p>
<p>
<b>DS</b>: Data Segment, defines the start of the data memory where we store all data processed by the program.
</p>
<p>
<b>SS</b>: Stack Segment, the start of the stack. The stack is a memory zone managed in a particular way: it&rsquo;s like a pile of plates, where we can only add and remove plates at the top. A single address register is enough to manage it: the stack pointer SP. We say that the stack is LIFO (Last In, First Out).
</p>
<p>
<b>ES</b>: Extra Segment, the start of an auxiliary segment for data
When trying to access a specific memory location, we use the notation <b>[Segment:Offset]</b>. For example, assume <b>DS = 0100h</b> and that we want to write the value <b>0x0005</b> to the memory location at physical address <b>1234h</b>. What do we do?
</p>
</div>
</div>
</div>
<div id="outline-container-orgb663ae9" class="outline-3">
<h3 id="orgb663ae9">The format of an address:</h3>
<div class="outline-text-3" id="text-orgb663ae9">
<p>
An address must have the following form [RS : RO], with these possibilities:
</p>
<ul class="org-ul">
<li>A value : Nothing</li>
<li>ES : DI</li>
<li>CS : SI</li>
<li>ES : BP</li>
<li>DS : BX</li>
<li><a id="org0f12415"></a>Answer :<br />
<div class="outline-text-5" id="text-org0f12415">
<div class="org-src-container">
<pre class="src src-asm"><span style="color: #89b4fa;">MOV</span> [DS:0234h], 0x0005
</pre>
</div>
<p>
Why ? Let&rsquo;s break it down :
<img src="../../../gifs/lain-dance.gif" alt="lain-dance.gif" />
</p>
<p>
We already know that <b>Effective = 10h x Segment + Offset</b>, so here we have: <b>1234h = 10h x DS + Offset</b>. Since <b>DS = 0100h</b>, we end up with the simple equation <b>1234h = 1000h + Offset</b>, therefore the Offset is <b>0234h</b>
</p>
<p>
Simple, right? Now for another example.
</p>
</div>
</li>
</ul>
</div>
<div id="outline-container-orgc26de48" class="outline-4">
<h4 id="orgc26de48">Note 1 :</h4>
<div class="outline-text-4" id="text-orgc26de48">
<div id="outline-container-org757ac64" class="outline-4">
<h4 id="org757ac64">Another example :</h4>
<div class="outline-text-4" id="text-org757ac64">
<p>
When the register isn&rsquo;t specified. the CPU adds it depending on the offset used :
What if we now have this instruction ?
</p>
<div class="org-src-container">
<pre class="src src-asm"> <span style="color: #cba6f7;">MOV</span> [0234h], 0x0005
</pre>
</div>
<p>
What does it do? You might or might not be surprised that it does the exact same thing as the previous snippet. Why? Because the CPU implicitly uses <b>DS</b> as the segment for ordinary data accesses. So if you don&rsquo;t specify a segment or a register (we will get to this later), the offset is taken relative to DS.
</p>
</div>
</div>
<div id="outline-container-org2f959c2" class="outline-4">
<h4 id="org2f959c2">Segment + Register &lt;3</h4>
<div class="outline-text-4" id="text-org2f959c2">
<p>
Consider <b>DS = 0100h</b> and <b>BX = BP = 0234h</b> and this code snippet:
</p>
<div class="org-src-container">
<pre class="src src-asm"> <span style="color: #cba6f7;">MOV</span> [BX], 0x0005 <span style="color: #6c7086;">; </span><span style="color: #a6e3a1; font-weight: bold;">NOTE</span><span style="color: #6c7086;"> : ITS NOT THE SAME AS MOV BX, 0x0005. Refer to earlier paragraphs</span>
</pre>
</div>
<p>
Well you guessed it right, it also does the same thing, but now consider this :
</p>
<div class="org-src-container">
<pre class="src src-asm"> <span style="color: #cba6f7;">MOV</span> [BP], 0x0005
</pre>
</div>
<p>
If you answered that it&rsquo;s the same one, you are wrong. As I said before, the implicit segment changes according to the register used as the offset. Here is the explicit equivalent of the two instructions above:
</p>
<div class="org-src-container">
<pre class="src src-asm"> <span style="color: #cba6f7;">MOV</span> [DS:BX], 0x0005
<span style="color: #cba6f7;">MOV</span> [SS:BP], 0x0005
</pre>
</div>
<p>
The General rule of thumb is as follows :
</p>
<ul class="org-ul">
<li>If the offset is DI, SI or BX, the segment used is DS.</li>
<li>If it&rsquo;s BP, then the segment is SS.</li>
<li>If it&rsquo;s BP or SP, then the segment is SS.</li>
</ul>
</div>
</div>
<div id="outline-container-orgc918fef" class="outline-4">
<h4 id="orgc918fef">Note 2 :</h4>
<div class="outline-text-4" id="text-orgc918fef">
<ul class="org-ul">
<li><a id="org0e8010b"></a>Note<br />
<div class="outline-text-5" id="text-org0e8010b">
<p>
Apparently we will assume that we are in the DS segment and only access to memory using the offset.
The values of the registers CS, DS and SS are automatically initialized by the OS when launching the program, so these segments are implicit: if we want to access specific data in memory, we just need to specify its offset. Also, you can&rsquo;t load an immediate value directly into a segment register such as DS; the value has to go through a general register (and CS is normally changed only by jumps, calls and returns). So, for example:
</p>
<div class="org-src-container">
<pre class="src src-asm"><span style="color: #89b4fa;">MOV</span> <span style="color: #cba6f7;">DS</span>, 0x0005 <span style="color: #6c7086;">; </span><span style="color: #6c7086;">Is INVALID</span>
<span style="color: #89b4fa;">MOV</span> <span style="color: #cba6f7;">DS</span>, AX <span style="color: #6c7086;">; </span><span style="color: #6c7086;">This one is VALID</span>
</pre>
</div>
</div>
<div id="outline-container-org4affc44" class="outline-4">
<h4 id="org4affc44">Note 3 :</h4>
<div class="outline-text-4" id="text-org4affc44">
<p>
The values of the registers CS DS and SS are automatically initialized by the OS when launching the program. So these segments are implicit. AKA : If we want to access a specific data in memory, we just need to specify its offset.
</p>
</div>
</li>
</ul>
</div>
</div>
</div>
</div>
<div id="postamble" class="status">
<p class="author">Author: Crystal</p>
<p class="date">Created: 2024-02-24 Sat 18:22</p>
<p class="date">Created: 2024-03-07 Thu 20:48</p>
</div>
</body>
</html>

BIN
src/gifs/lain-dance.gif Normal file

Binary file not shown.



@ -68,42 +68,62 @@ LOOP
#+BEGIN_SRC asm
MUL BX ; DX:AX = AX * BX
#+END_SRC
** Addressing and registers...again
*** I realized what I wrote here before was almost gibberish, sooo here we go again I guess ?
*** Offset/Address Registers
*SP*: This is the stack pointer. It is of 16 bits. It points to the topmost item of the stack. If the stack is empty the stack pointer will be (FFFE)H (or 65534 in decimal). Its offset address is relative to the stack segment(SS).
Well, let's take a step back to the notion of effective addresses versus relative ones.
*** Effective = 10h x Segment + Offset . Part1
When trying to access a specific memory location, we use the notation *[Segment:Offset]*. For example, assume *DS = 0100h* and that we want to write the value *0x0005* to the memory location at physical address *1234h*. What do we do?
**** Answer :
#+BEGIN_SRC asm
MOV [DS:0234h], 0x0005
#+END_SRC
*BP*: This is the base pointer. It is of 16 bits. It is primarily used in accessing parameters passed by the stack. Its offset address is relative to the stack segment(SS).
Why ? Let's break it down :
[[../../../gifs/lain-dance.gif]]
*SI*: This is the source index register. It is of 16 bits. It is used in the pointer addressing of data and as a source in some string-related operations. Its offset is relative to the data segment(DS).
*DI*: This is the destination index register. It is of 16 bits. It is used in the pointer addressing of data and as a destination in some string-related operations. Its offset is relative to the extra segment(ES).
*** Segment Registers
*CS*: Code Segment, it defines the start of the program memory, and the different addresses of the different instructions relative to CS.
We already know that *Effective = 10h x Segment + Offset*, so here we have: *1234h = 10h x DS + Offset*. Since *DS = 0100h*, we end up with the simple equation *1234h = 1000h + Offset*, therefore the Offset is *0234h*
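Here is a small sketch of the same arithmetic seen from the other side. It assumes NASM-style syntax and that DS = 0100h and ES = 0123h were already set up (the ES value is made up for this example, it does not come from the text above). Two different Segment:Offset pairs can land on the same physical address:
#+BEGIN_SRC asm
; Assumed register values (illustration only): DS = 0100h, ES = 0123h
MOV WORD [DS:0234h], 0x0005 ; 0100h x 10h + 0234h = 01234h, stores 0005h there
MOV AX, [ES:0004h]          ; 0123h x 10h + 0004h = 01234h, same byte pair, so AX = 0005h
#+END_SRC
The WORD keyword is something NASM asks for when neither operand is a register, so that it knows how many bytes to write; other assemblers may spell it differently.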
*DS*: Data Segment, defines the start of the data memory where we store all data processed by the program.
*SS*: Stack Segment, the start of the stack. The stack is a memory zone managed in a particular way: it's like a pile of plates, where we can only add and remove plates at the top. A single address register is enough to manage it: the stack pointer SP. We say that the stack is LIFO (Last In, First Out).
Simple, right ?, now for another example
*** Another example :
What if we now have this instruction ?
#+BEGIN_SRC asm
MOV [0234h], 0x0005
#+END_SRC
What does it do? You might or might not be surprised that it does the exact same thing as the previous snippet. Why? Because the CPU implicitly uses *DS* as the segment for ordinary data accesses. So if you don't specify a segment or a register (we will get to this later), the offset is taken relative to DS.
*ES*: Extra Segment, the start of an auxiliary segment for data
** The format of an address:
An address must have the following form [RS : RO], with these possibilities:
- A value : Nothing
- ES : DI
- CS : SI
- ES : BP
- DS : BX
*** Segment + Register <3
*** Note 1 :
When the register isn't specified. the CPU adds it depending on the offset used :
Consider *DS = 0100h* and *BX = BP = 0234h* and this code snippet:
#+BEGIN_SRC asm
MOV [BX], 0x0005 ; NOTE : ITS NOT THE SAME AS MOV BX, 0x0005. Refer to earlier paragraphs
#+END_SRC
Well you guessed it right, it also does the same thing, but now consider this :
#+BEGIN_SRC asm
MOV [BP], 0x0005
#+END_SRC
If you answered that it's the same one, you are wrong. As I said before, the implicit segment changes according to the register used as the offset. Here is the explicit equivalent of the two instructions above:
#+BEGIN_SRC asm
MOV [DS:BX], 0x0005
MOV [SS:BP], 0x0005
#+END_SRC
The General rule of thumb is as follows :
- If the offset is DI, SI or BX, the segment used is DS.
- If it's BP, then the segment is SS.
- If it's BP or SP, then the segment is SS.
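To see the rule in action, here is a hedged sketch in NASM-style syntax. DS = 0100h and BX = BP = 0234h are the values from the example; SS = 0200h is a value I made up so that the two segments differ:
#+BEGIN_SRC asm
; Assumed register values (illustration only):
; DS = 0100h, SS = 0200h, BX = BP = 0234h
MOV WORD [BX], 0x0005    ; implicit DS:BX -> 0100h x 10h + 0234h = 01234h
MOV WORD [BP], 0x0005    ; implicit SS:BP -> 0200h x 10h + 0234h = 02234h
MOV WORD [DS:BP], 0x0005 ; explicit override: DS:BP -> 01234h again
#+END_SRC
The last line shows that the implicit segment is only a default: writing the segment explicitly overrides it.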
*** Note 2 :
Apparently we will assume that we are in the DS segment and only access to memory using the offset.
*** Note 3 :
The values of the registers CS DS and SS are automatically initialized by the OS when launching the program. So these segments are implicit. AKA : If we want to access a specific data in memory, we just need to specify its offset.
**** Note
The values of the registers CS, DS and SS are automatically initialized by the OS when launching the program, so these segments are implicit: if we want to access specific data in memory, we just need to specify its offset. Also, you can't load an immediate value directly into a segment register such as DS; the value has to go through a general register (and CS is normally changed only by jumps, calls and returns). So, for example:
#+BEGIN_SRC asm
MOV DS, 0x0005 ; Is INVALID
MOV DS, AX ; This one is VALID
#+END_SRC
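A minimal sketch of two common workarounds, assuming NASM-style syntax and that 0100h is the segment value we want (both are assumptions for illustration):
#+BEGIN_SRC asm
; Route 1: go through a general register
MOV AX, 0100h ; the immediate goes into AX first
MOV DS, AX    ; then AX is copied into DS

; Route 2: copy another segment register through the stack
PUSH CS       ; save the current code segment value on the stack
POP DS        ; and pop it back into DS, so DS = CS
#+END_SRC
Route 2 is handy when code and data live in the same segment, since there is no MOV between two segment registers either.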