intro
RoureXOS is a very primitive 16-bit DOS-like operating system (OS) that boots from a floppy disk and runs directly on top of the BIOS. This article presents its basic concepts, functionality, and challenges, accompanied by screenshots.
The article has been written over several months and is still being extended and edited. Individual sections therefore mention the system version(s) that were current when the section or snippet was written. The thumbnails below contain figures/screenshots of the booted system; their content is in both Czech (legacy) and English. Most are screenshots from the QEMU virtual machine (plus a Linux terminal and Bochs).
brief history
Project RoureXOS was originally developed in 2010–2012; it has been resurrected (2023–now) using a Windows XP image run in VirtualBox, because an old Windows NT version is needed to run the TASM, TCC and JLOC tools (hopefully to be replaced later). The project’s name comes from its origin: the Rouring.net organization carried out the development. The name rourex was used for a small CLI prompt project, hence the name of the operating system: rourexos.
Credit is due to the main source of both code (mainly the bootloader and kernel skeleton) and inspiration: DjH @ soom.cz [and his AltairOS project there (CZ)]. [8] [9]
architecture
bootloader
The system itself is relatively tiny. It consists of a prefabricated 16-bit Real Mode stage 0 bootloader, which initializes the FAT12 filesystem structure and tree, checks the floppy disk, and finally finds and loads the kernel. The bootloader is written in NASM assembly. [5]
Since v0.8.0, the bootloader is much simpler: rfs (rourex file system) is being developed, and reading and loading the kernel is hardcoded for the moment.
kernel
The kernel is written both in x86 assembly (NASM; to be linked with the C kernel) and in Turbo C with Turbo Assembler (TASM) integration (inline assembly). Since TASM runs only under Windows, a Windows XP image is used to compile and build the project. The kernel is compiled and built as the ROUREXS.COM file. (DOSBox can be used for compiling too, but it is much slower than running the process in a Windows virtual machine.) [9]
At the moment (v0.10.0), the kernel is still monolithic (around 34 kB) and takes up a significant part of the available memory (one Real Mode memory segment is 64 kB at most). The main goal here is to break the kernel into a shell interpreter, small tooling apps and kernel modules. [7]
Since v0.11.0 there is a partially functioning procedure for loading and executing external programs, which allows the kernel to shrink as functionality is split into small tools (around 2–5 kB each).
It is planned to use the A20 gate hack to access High Memory segment(s) too. [3]
disk image
The final step is to create a bootable floppy image (RoureXOS.img, 1.44 MB), which can be deployed to a floppy diskette. The kernel is inserted into the floppy image along with other files such as NAVOD.TXT (something like a README file), USER.SYS (user name for console login), and PASS.SYS (plaintext password).
Since rfs came into use, the image contents are sparser, as the kernel location is hardcoded (due to the lack of an rfs driver for GNU/Linux).
functionalities
base prompt
Fig. 1: System booting and kernel starting.
Fig. 2: Basic prompt commands shown: ver, dir, help. The third subfigure shows the extended command list as of v0.9.9.
console login and TUI (legacy)
Fig. 3: RoureXOS console “GUI” (TUI) login dialogue window.
Fig. 4: Console UI after login — clean “desktop” with files listed and actual time shown.
Fig. 5: Console UI menu “window”.
Note: This section was meant to introduce some of the OS’s functionality and to show the look of the base prompt. The next sections go deeper into technical detail and present more functionality not mentioned before. Some sections also show progress over time (some theory, implementation, testing, debugging, etc.).
serial link
See more on serial link and modem tuning here.
For the purposes of serial port communication testing, QEMU has been configured to allow direct access to the host machine’s ttyUSB0 device.
Note: Attach the USB serial port adapter to the virtual machine, use the host RTC (real-time clock), and emulate a floppy disk drive (attached to the machine with a virtual diskette as IMG_FILE).
To check the port status, the port command can be used. Even port signal changes can be detected (see Fig. 6).
Fig. 6: Serial port change detection. Modem and line status according to BIOS. Connection indication in prompt from v0.7.9.
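The status reported by the BIOS can be decoded bit by bit. The sketch below assumes that the status word comes from BIOS INT 14h, AH=03h, which returns the line status in AH and the modem status in AL; the article does not confirm that the port command uses exactly this call. All names are hypothetical, and the code is written as portable C for a modern host compiler rather than for Turbo C.

```c
#include <assert.h>
#include <stdint.h>

/* Modem status bits (AL after INT 14h, AH=03h). */
enum {
    MSR_DELTA_CTS   = 0x01,  /* CTS changed since last read */
    MSR_DELTA_DSR   = 0x02,  /* DSR changed since last read */
    MSR_RI_TRAILING = 0x04,  /* trailing edge of ring indicator */
    MSR_DELTA_DCD   = 0x08,  /* carrier detect changed */
    MSR_CTS         = 0x10,  /* clear to send */
    MSR_DSR         = 0x20,  /* data set ready */
    MSR_RI          = 0x40,  /* ring indicator */
    MSR_DCD         = 0x80,  /* data carrier detect */
    MSR_DELTA_MASK  = 0x0F   /* all "something changed" bits */
};

/* Line status bits (AH after INT 14h, AH=03h). */
enum {
    LSR_DATA_READY = 0x01,
    LSR_OVERRUN    = 0x02,
    LSR_PARITY     = 0x04,
    LSR_FRAMING    = 0x08,
    LSR_BREAK      = 0x10,
    LSR_THR_EMPTY  = 0x20,
    LSR_TSR_EMPTY  = 0x40,
    LSR_TIMEOUT    = 0x80
};

static_assert((MSR_DELTA_CTS | MSR_DELTA_DSR | MSR_RI_TRAILING | MSR_DELTA_DCD)
              == MSR_DELTA_MASK, "delta bits occupy the low nibble");

typedef struct {
    uint8_t line;   /* copy of AH */
    uint8_t modem;  /* copy of AL */
} port_status_t;

/* Split the AX value returned by the BIOS into its two status bytes. */
static port_status_t split_status(uint16_t ax)
{
    port_status_t s;
    s.line  = (uint8_t)(ax >> 8);
    s.modem = (uint8_t)(ax & 0xFF);
    return s;
}

/* Non-zero while the modem reports a carrier (an active connection). */
static int carrier_detected(port_status_t s)
{
    return (s.modem & MSR_DCD) != 0;
}

/* Non-zero if any modem signal changed since the previous status read. */
static int signal_changed(port_status_t s)
{
    return (s.modem & MSR_DELTA_MASK) != 0;
}
```

A prompt indicator like the one in Fig. 6 could then poll the status and react whenever `signal_changed` reports a change.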
communication with a modem
For testing purposes (from v0.7.3), there is a command kom to open/attach a communication tunnel with the counterpart (mostly a dial-up modem). The active connection can be left intact while one continues at the base prompt; the session can then be attached again later.
caller side (rourexos)
In RoureXOS, a simple terminal interface (program) is implemented as the kom command. This program is external to the kernel itself; put more simply, it is an external program that is loaded and executed by the system kernel.
Below are some examples of how to get the configuration status of a modem (ati4), and how to dial a remote counterpart (atdtXXX).
Fig. 7: Serial link communication with modem. QEMU attached machine (passthrough serial device).
counterpart side (linux)
Fig. 8: Terminal attached to the counterpart’s serial link to a modem. RoureXOS (virtualized) is attached to a serial port too, with a modem connected as well. RoureXOS starts a dial, the counterpart (green screen) answers, and negotiation takes place. After that, the link is established as a Layer 1 link. The string on the last line is sent from the RoureXOS instance over the line.
ASCII terminal session over the dial-up link
Note: opsidian is a server with a modem connected via a USB adapter; it acts as the dial-in server and as a telnet gateway via mgetty.
Fig. 9: Executing the initial make command for the swis-api project. No TERM variable is set, so the interactive features make the terminal unstable.
Fig. 10: Interacting with the FozzTexx’s Level 29 BBS.
Fig. 11: Czech RSS news feed parsed and transformed to Unicode so that it is viewable in rourexos. In between sits a tiny HTTP/TELNET gateway (bbs-go TELNET server).
16bit memory model
https://www.cs.ubbcluj.ro/~vancea/asc/practic/nasm_html/nasmdoc8.html
http://www.sunshine2k.de/articles/coding/cmemalloc/cmemory.html
3.5" floppy disk
A typical 3.5" (1.44 MB) floppy disk has these parameters:
- 2 heads
- 80 tracks (cylinders) per head
- 18 sectors per track
- 512 bytes per sector
Fig. 12: Cylinder-Head-Sector (CHS) architecture of a generic disk storage device. [16]
LBA to CHS
Modern addressing uses Logical Block Addressing (LBA), which treats the disk as a linear array of sectors. The older scheme is Cylinder-Head-Sector (CHS) addressing. The two schemes are interchangeable: an address in one can be converted into the other and vice versa (see the equations below).
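The conversion can be sketched in C using the floppy geometry listed earlier (2 heads, 80 cylinders, 18 sectors per track, 512 bytes per sector). These are the standard LBA/CHS formulas, not code taken from RoureXOS; the type and function names are hypothetical, and the code targets a modern host compiler rather than Turbo C.

```c
#include <assert.h>
#include <stdint.h>

/* Geometry of a 1.44 MB 3.5" floppy disk. */
enum {
    HEADS             = 2,
    CYLINDERS         = 80,
    SECTORS_PER_TRACK = 18,
    BYTES_PER_SECTOR  = 512,
    TOTAL_SECTORS     = HEADS * CYLINDERS * SECTORS_PER_TRACK,  /* 2880 */
    FLOPPY_BYTES      = TOTAL_SECTORS * BYTES_PER_SECTOR        /* 1474560 */
};

typedef struct {
    uint16_t cylinder;  /* 0..79 */
    uint8_t  head;      /* 0..1 */
    uint8_t  sector;    /* 1..18, CHS sector numbers are 1-based */
} chs_t;

/* LBA -> CHS:
 *   C = LBA / (heads * sectors_per_track)
 *   H = (LBA / sectors_per_track) mod heads
 *   S = (LBA mod sectors_per_track) + 1
 */
static chs_t lba_to_chs(uint16_t lba)
{
    chs_t chs;
    assert(lba < TOTAL_SECTORS);
    chs.cylinder = (uint16_t)(lba / (HEADS * SECTORS_PER_TRACK));
    chs.head     = (uint8_t)((lba / SECTORS_PER_TRACK) % HEADS);
    chs.sector   = (uint8_t)((lba % SECTORS_PER_TRACK) + 1);
    return chs;
}

/* CHS -> LBA:
 *   LBA = (C * heads + H) * sectors_per_track + (S - 1)
 */
static uint16_t chs_to_lba(chs_t chs)
{
    assert(chs.sector >= 1 && chs.sector <= SECTORS_PER_TRACK);
    return (uint16_t)((chs.cylinder * HEADS + chs.head) * SECTORS_PER_TRACK
                      + (chs.sector - 1));
}
```

For example, LBA 100 maps to cylinder 2, head 1, sector 11, and converting that triple back yields 100 again. The geometry also gives the familiar capacity of 2 × 80 × 18 × 512 = 1,474,560 bytes.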
fresh FAT12 filesystem
rourex file system (rfs)
Originally, the OS was meant to work with FAT12-formatted diskettes/floppy binary images. Finding and reading files had already been implemented, but writing files proved to be a difficult task. While reading through posts on FAT12 architecture and implementation, I got the idea of implementing my own filesystem (thus partly inspired by FAT12, the File Allocation Table).
Fig. 13: Nested directories testing example screenshot (still WIP).
overview
The root directory starts at sector LBA (Logical Block Addressing) \(100\). Each new file (including new directories) is allocated the next sector index, so file sector numbers go as \(\{101, 102, 103, \dots, 119\}\). Directories follow the same increasing scheme, but in hundreds: \(\{200, 300, 400, \dots, 1900\}\). Thus around 360 files can be allocated at the moment.
The system is designed mainly for floppy images, which usually work with 512-byte blocks (sectors, or clusters). Each file occupies exactly one sector: metadata is stored in a 32-byte header, leaving 480 bytes for the file contents. Directories are stored as files too, but with zero data size and no content. This could be improved (by implementing some kind of directory table as in FAT).
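The numbering scheme can be written down as a small sketch. It assumes that subdirectories number their file slots the same way as the root does (201..219 under the directory at 200, and so on), which the text implies but does not state outright; all names are hypothetical, and the code is portable C for a host compiler.

```c
#include <assert.h>
#include <stdint.h>

enum {
    RFS_ROOT_LBA      = 100,  /* root directory sector */
    RFS_DIR_STRIDE    = 100,  /* subdirectories at 200, 300, ..., 1900 */
    RFS_FILES_PER_DIR = 19,   /* slots 101..119 under the root */
    RFS_MAX_DIRS      = 19,   /* root + 18 subdirectories */
    RFS_SECTOR_SIZE   = 512,
    RFS_HEADER_SIZE   = 32,
    RFS_DATA_SIZE     = RFS_SECTOR_SIZE - RFS_HEADER_SIZE,  /* 480 */
    RFS_MAX_FILES     = RFS_MAX_DIRS * RFS_FILES_PER_DIR    /* 361 */
};

/* LBA of a directory: index 0 is the root (100), 1 -> 200, ..., 18 -> 1900. */
static uint16_t rfs_dir_lba(unsigned dir_index)
{
    assert(dir_index < RFS_MAX_DIRS);
    return (uint16_t)(RFS_ROOT_LBA + dir_index * RFS_DIR_STRIDE);
}

/* LBA of file slot 1..19 inside a directory.
 * ASSUMPTION: subdirectories use the same slot numbering as the root. */
static uint16_t rfs_file_lba(unsigned dir_index, unsigned slot)
{
    assert(slot >= 1 && slot <= RFS_FILES_PER_DIR);
    return (uint16_t)(rfs_dir_lba(dir_index) + slot);
}
```

Under these assumptions there are 19 directories with 19 slots each, that is 361 slots, which matches the "around 360 files" estimate. The highest sector used (1919) still fits within the 2880 sectors of a 1.44 MB image.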
Fig. 14: Filesystem entries diagram. Root directory is allocated at LBA 100 by default. Root directory has 4 files allocated within, one of a directory type. This subdirectory is then allocated both in root directory and at LBA 200. The subdirectory has allocated other two (system) files.
file types
type | value | description |
---|---|---|
T_FREE | 0x00 | free sector, default |
T_TEXT | 0x01 | text file |
T_DIR | 0x02 | directory |
T_SYS | 0x03 | system file (restricted access) |
T_PWD | 0x04 | password file (restricted access, hashed?) |
T_BAK | 0x05 | backup file |
T_LINK | 0x06 | symbolic link to a file (using sector_no2) |
T_DELETED | 0x07 | deleted file/entry |
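In C, the table above maps naturally onto an enumeration. This is a sketch: the constant names and values come from the table, while the type name and the helper function are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* rfs entry types, values as listed in the table. */
typedef enum {
    T_FREE    = 0x00,  /* free sector, default */
    T_TEXT    = 0x01,  /* text file */
    T_DIR     = 0x02,  /* directory */
    T_SYS     = 0x03,  /* system file (restricted access) */
    T_PWD     = 0x04,  /* password file (restricted access) */
    T_BAK     = 0x05,  /* backup file */
    T_LINK    = 0x06,  /* symbolic link (uses sector_no2) */
    T_DELETED = 0x07   /* deleted file/entry */
} rfs_type_t;

static_assert(T_DELETED <= 0xFF, "type must fit the 1-byte header field");

/* Non-zero for the types the table marks as restricted. */
static int rfs_type_is_restricted(uint8_t type)
{
    return type == T_SYS || type == T_PWD;
}
```

Since a freshly zeroed sector reads as `T_FREE`, a blank image needs no extra initialization of the type field.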
32-byte entry header
Entry metadata are derived from FAT12 Directory Entry metadata scheme. [1]
byte offset | length in bytes | metadata name | description |
---|---|---|---|
0 | 8 | name | entry base name |
8 | 3 | ext | entry extension |
11 | 1 | type | entry/file type |
12 | 1 | mode | entry mode |
13 | 1 | creation_tenth | entry creation time in tenths of seconds |
14 | 2 | creation_time | entry creation time |
16 | 2 | creation_date | entry creation date |
18 | 2 | last_access_date | entry last access date |
20 | 2 | sector_no | main sector LBA address |
22 | 2 | modify_time | last modification time |
24 | 2 | modify_date | last modification date |
26 | 2 | sector_no2 | secondary sector LBA address |
28 | 4 | filesize | entry/file size |
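The header layout can be expressed as a C structure whose field offsets are checked at compile time. This is a host-side sketch in C11, not the Turbo C definition used in the kernel: the field names follow the table, while the type name and the helper are hypothetical. It also assumes that multi-byte fields are stored little-endian, as an x86 system would write them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-byte rfs entry header, fields in table order. */
typedef struct {
    char     name[8];           /* offset  0: entry base name */
    char     ext[3];            /* offset  8: entry extension */
    uint8_t  type;              /* offset 11: entry/file type */
    uint8_t  mode;              /* offset 12: entry mode */
    uint8_t  creation_tenth;    /* offset 13: creation time, tenths of s */
    uint16_t creation_time;     /* offset 14 */
    uint16_t creation_date;     /* offset 16 */
    uint16_t last_access_date;  /* offset 18 */
    uint16_t sector_no;         /* offset 20: main sector LBA */
    uint16_t modify_time;       /* offset 22 */
    uint16_t modify_date;       /* offset 24 */
    uint16_t sector_no2;        /* offset 26: secondary sector LBA */
    uint32_t filesize;          /* offset 28: entry/file size */
} rfs_entry_t;

/* The fields are naturally aligned, so no packing pragma should be needed;
 * these checks fail at compile time if a compiler disagrees. */
static_assert(sizeof(rfs_entry_t) == 32, "header must be 32 bytes");
static_assert(offsetof(rfs_entry_t, type) == 11, "type at offset 11");
static_assert(offsetof(rfs_entry_t, sector_no) == 20, "sector_no at offset 20");
static_assert(offsetof(rfs_entry_t, filesize) == 28, "filesize at offset 28");

/* Read sector_no from a raw header without relying on host byte order.
 * ASSUMPTION: on-disk fields are little-endian. */
static uint16_t rfs_hdr_sector_no(const uint8_t *hdr)
{
    assert(hdr != NULL);
    return (uint16_t)(hdr[20] | ((uint16_t)hdr[21] << 8));
}
```

With a 32-byte header at the start of each 512-byte sector, the file data begins at byte offset 32 and can span at most 480 bytes.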
function prototypes
[…]
COM program execution
As 16-bit Real Mode is used, one has to do one’s homework on memory segmentation and gain a deeper understanding of the segment registers (the segment part of a memory address) and the general-purpose registers (the offset part of a memory address). [2] [3] [5]
“640k ought to be enough for anybody” — Bill Gates, 1981 [5]
Tab. 1: Segment registers, general purpose (GP) registers, and their link to memory segmentation. [2] [5]
segment register | GP register (offset) | memory segment |
---|---|---|
CS | IP, BX | code |
SS | SP, BP | stack |
DS | AX, SI | data |
ES | DI | code/data |
FS | | code/data (i386 and higher only) |
GS | | code/data (i386 and higher only) |
Fig. 15: Segment and GP registers in Real Mode. [2]
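How a segment:offset pair turns into a physical address can be shown with a few lines of C. In Real Mode the CPU shifts the segment left by four bits and adds the offset; with the A20 line disabled, the result wraps around at 1 MB. The function name and the A20 flag parameter are hypothetical, chosen only for this sketch.

```c
#include <assert.h>
#include <stdint.h>

enum { A20_WRAP_MASK = 0xFFFFF };  /* 20 address lines = 1 MB */

/* The highest address reachable in Real Mode lies just below 1 MB + 64 kB. */
static_assert(((0xFFFFUL << 4) + 0xFFFFUL) == 0x10FFEFUL,
              "FFFF:FFFF is the top of the High Memory Area");

/* physical = segment * 16 + offset, wrapped to 20 bits if A20 is off. */
static uint32_t real_mode_linear(uint16_t segment, uint16_t offset,
                                 int a20_enabled)
{
    uint32_t linear = ((uint32_t)segment << 4) + offset;
    return a20_enabled ? linear : (linear & A20_WRAP_MASK);
}
```

Different pairs can name the same byte: both 07C0:0000 and 0000:7C00 resolve to 0x7C00, the address where the BIOS loads the boot sector. The pair FFFF:0010 reaches 0x100000 only when A20 is enabled; otherwise it wraps to address 0, which is why the A20 gate matters for using the High Memory Area.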
far memory jumps (Real Mode, NASM/TASM)
Jumping to a raw address requires setting the segment and GP registers to point to the new address. Far jumps (jumps to another segment, also called intersegment jumps) have been complicated to implement in TASM (Turbo Assembler), as there are some limitations (e.g. a strict memory addressing procedure). [4]
Note: those instructions are prefixed with asm in the C code, as an example of inline assembly. The code is assembled using TASM and TCC (Turbo C Compiler). [10]
COM binary disassembly
- https://gist.github.com/Tony3-sec/264677a67064f94c3946502ffbbcc6f8
- https://stackoverflow.com/questions/5125896/how-to-disassemble-a-binary-executable-in-linux-to-get-the-assembly-code
QEMU debugging
Save physical memory (pmemsave) starting at address 0x00000 with a length of 128 kB to a file called mem.bin.
To list the CPU registers, use this line on the compatmonitor0 QEMU tab: [15]
custom programming
To begin with custom programming (writing standalone binaries/programs in RoureXOS), it is vital to possess some sort of text editor to write, read, and edit the source code.
text editor
Since v0.11.1 there is an external program called simply ED.COM, which can be loaded by the kernel as a custom text editor with its own interface and tooling. (WIP)
Fig. 16: A simple line editor prototype. (v0.11.4)
Q: Is it viable to write a simple line text editor in C and x86 assembly in 2024?
A: Yes, writing a simple line text editor in C and x86 assembly remains viable in 2024. While there are many modern tools available for text editing, developing a simple text editor in C and x86 assembly can be a useful learning experience for understanding low-level programming concepts and system interactions. Additionally, creating such a tool from scratch can provide insights into how text editors function at a foundational level.
[…]
assembler + linker
It is not clear whether a linker would really be needed if we wanted a single binary from a single ASM file. For the assembler (parser), we would need a lookup table with the corresponding opcodes for the given instruction set.
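Such a lookup table could start out as a plain array of mnemonic/opcode pairs. The sketch below covers only a handful of one-byte x86 instructions without operands; a real assembler would also have to encode operands, addressing modes and labels. The structure and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    const char *mnemonic;
    uint8_t     opcode;
} op_entry_t;

/* A few operand-less, single-byte x86 instructions. */
static const op_entry_t OPCODES[] = {
    { "nop",  0x90 },
    { "ret",  0xC3 },  /* near return */
    { "retf", 0xCB },  /* far return */
    { "iret", 0xCF },
    { "hlt",  0xF4 },
    { "cli",  0xFA },
    { "sti",  0xFB },
    { "cld",  0xFC },
    { "std",  0xFD }
};

/* Look up a mnemonic; returns 1 and stores the opcode on success,
 * 0 if the mnemonic is not in the table. */
static int lookup_opcode(const char *mnemonic, uint8_t *opcode_out)
{
    size_t i;
    assert(mnemonic != NULL && opcode_out != NULL);
    for (i = 0; i < sizeof OPCODES / sizeof OPCODES[0]; i++) {
        if (strcmp(OPCODES[i].mnemonic, mnemonic) == 0) {
            *opcode_out = OPCODES[i].opcode;
            return 1;
        }
    }
    return 0;
}
```

A linear search is likely good enough for a table of this size. As for the linker question: a flat COM binary is loaded at offset 0x100 of its segment, so an assembler that resolves its own labels against that origin could plausibly emit the final file directly.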
[…]
- https://en.wikipedia.org/wiki/X86_instruction_listings#x86_integer_instructions
- https://en.wikipedia.org/wiki/X86_assembly_language
interrupt handlers and API
[…]
interrupt table
The following tables act as a concept of how to organize the main API interrupt vector (0x30).
Each table links to a special system binary (e.g. RFS2.COM), which will be loaded into memory at system start. A pointer to the address containing the Interrupt Service Routine (ISR) will be hardcoded in the wrapper ISR, which will act as a switch for other external routines (SWITCH.COM/API.COM).
Example code of the main (wrapper) interrupt handler: [12]
file manipulation services
The ISR will be loaded from the RFS2.COM binary. [18]
interrupt vector | AH | description |
---|---|---|
0x30 | 0x10 | read file in current directory, filename in SI, buffer in BX |
0x30 | 0x11 | write to file in current directory, filename in SI, buffer in BX |
0x30 | 0x12 | add a new file in current directory, filename in SI, buffer in BX |
0x30 | 0x13 | delete a file in current directory, filename in SI |
0x30 | 0x14 | load a file to program memory and execute, filename in SI |
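The switching role of the wrapper ISR can be modelled in plain C as a jump table indexed by the AH value. This is only a host-side illustration of the concept, not the actual handler: the handlers are stubs, and every name and signature here is hypothetical (the filename parameter stands in for SI, the buffer for BX).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Service numbers passed in AH, taken from the table above. */
enum {
    SVC_READ   = 0x10,
    SVC_WRITE  = 0x11,
    SVC_ADD    = 0x12,
    SVC_DELETE = 0x13,
    SVC_EXEC   = 0x14,
    SVC_FIRST  = SVC_READ,
    SVC_LAST   = SVC_EXEC
};

/* Hypothetical handler signature. */
typedef int (*service_fn)(const char *filename, uint8_t *buffer);

/* Stub handlers: each one only reports which service was selected. */
static int svc_read(const char *f, uint8_t *b)   { (void)f; (void)b; return SVC_READ; }
static int svc_write(const char *f, uint8_t *b)  { (void)f; (void)b; return SVC_WRITE; }
static int svc_add(const char *f, uint8_t *b)    { (void)f; (void)b; return SVC_ADD; }
static int svc_delete(const char *f, uint8_t *b) { (void)f; (void)b; return SVC_DELETE; }
static int svc_exec(const char *f, uint8_t *b)   { (void)f; (void)b; return SVC_EXEC; }

static const service_fn SERVICES[] = {
    svc_read, svc_write, svc_add, svc_delete, svc_exec
};

static_assert(sizeof SERVICES / sizeof SERVICES[0] == SVC_LAST - SVC_FIRST + 1,
              "one handler per AH value");

/* Dispatch on AH; returns the handler's result or -1 for an unknown service. */
static int api_dispatch(uint8_t ah, const char *filename, uint8_t *buffer)
{
    if (ah < SVC_FIRST || ah > SVC_LAST)
        return -1;
    return SERVICES[ah - SVC_FIRST](filename, buffer);
}
```

Keeping the service numbers contiguous makes the dispatch a simple array index. Adding a new service then means appending one handler and raising `SVC_LAST`.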
[…]
adding the Handler to the IVT
This is fairly simple. [12]
- First, null out ES.
- Set AL = interrupt number, and BL = 4h.
- Multiply AL by BL, and then put the result (AX) in BX.
- Move the word that is the address of the start of your interrupt handler into [es:bx].
- Add 2 to BX.
- Move your handler’s segment into [es:bx].
- Restore your original ES and you’re done!
Example code for the list above: [17]
ChatGPT answer
In x86 assembly and C language, you can load a new interrupt vector routine function to the interrupt vector table in the BIOS by following these steps:
- Write your interrupt handler routine in assembly or C.
- Obtain the interrupt vector number for the specific interrupt you want to handle.
- Load the new interrupt vector by writing the segment and offset address of the routine to the corresponding entry in the interrupt vector table.
- Ensure proper memory protection if needed by marking the memory as read-only after modifying the interrupt vector table.
Remember to be cautious when directly modifying BIOS interrupt vectors, as incorrect changes can lead to system instability or crashes.
games
simple snake
There is a concept of writing simple games for RoureXOS, or at least porting them to the system somehow. The game (SNAKE.COM) is to be stored in binary format on the floppy disk together with the other system tools. [14]
[…]