RoureXOS operating system

intro

RoureXOS is a very primitive 16-bit DOS-like operating system (OS) that boots from a floppy disk and runs just above the BIOS. This article presents its basic concepts, functionalities, and challenges, illustrated with screenshots.

This article has been written over several months and is still being extended and edited. Individual topics therefore mention the system version(s) that were current when the given section or snippet was written. The figures below are screenshots of the booted system; their content is partly in Czech (legacy) and partly in English. Most of them come from the QEMU virtual machine (plus a Linux terminal and Bochs).

brief history

RoureXOS was originally developed in 2010–2012 and has been revived since 2023 using a Windows XP image running in VirtualBox: an old Windows NT-family version is needed to run the TASM, TCC, and JLOC tools (hopefully to be replaced later). The project’s name comes from its origin, as the Rouring.net organization took part in the development. The name rourex was used for a small CLI prompt project, hence the name of the operating system: rourexos.

Credit is due to the main source of both code (mainly the bootloader and kernel skeleton) and inspiration: DjH @ soom.cz and his AltairOS project there (in Czech). [8] [9]

architecture

bootloader

The system itself is relatively tiny. It consists of a prefabricated 16-bit Real Mode stage 0 bootloader, which initializes the FAT12 filesystem structure and tree, checks the floppy disk, and finally finds and loads the kernel. The bootloader is written in NASM assembly. [5]

Since v0.8.0 the bootloader has been much simpler: rfs (rourex file system) is under development, and for now the reading and loading of the kernel is hardcoded.

kernel

The kernel is written both in x86 assembly (NASM; to be linked with the C kernel) and in Turbo C with integrated Turbo Assembler (TASM) inline assembly. Since TASM is executable only under Windows, a Windows XP image is used to compile and build the project. The kernel is compiled and built as the ROUREXS.COM file. (DOSBox can be used for compiling too, but it is much slower than running the process in a Windows virtual machine.) [9]

At the moment (v0.10.0), the kernel is still monolithic (around 34 kB) and takes up a significant part of the available memory (one Real Mode memory segment is 64 kB at most). The main goal here is to break the kernel into a shell interpreter, small tooling apps, and kernel modules. [7]

Since v0.11.0 there is a partially working procedure for loading and executing external programs, which lets the kernel shrink as functionality is split into small tools (around 2–5 kB each).

It is planned to use the A20 gate hack to access High Memory segment(s) too. [3]

disk image

The final step is to create a bootable floppy image (RoureXOS.img, 1.44 MB), which can be written to a floppy diskette. The kernel is inserted into the floppy image along with other files such as NAVOD.TXT (something like a README file), USER.SYS (user name for console login), and PASS.SYS (plaintext password).

Since rfs came into use, the image contents are sparser, as the kernel location is hardcoded (due to the lack of an rfs driver for GNU/Linux).

functionalities

base prompt

Fig. 1: System booting and kernel starting.

Fig. 2: Basic prompt commands shown: ver, dir, help. The third subfigure shows the extended command list as of v0.9.9.

console login and TUI (legacy)

Fig. 3: RoureXOS console “GUI” (TUI) login dialogue window.

Fig. 4: Console UI after login: a clean “desktop” with files listed and the current time shown.

Fig. 5: Console UI menu “window”.

Note: This section was meant to introduce some of the OS’s functionalities and to show its look and base prompt. The next sections go deeper into the technical details and present further functionalities not mentioned so far. In some sections, progress over time can be seen too (some theory, implementation, testing, debugging, etc.).

serial link

See more on serial link and modem tuning here.

For the purposes of serial port communication testing, QEMU has been configured to allow direct access to the host machine’s ttyUSB0 device.

QEMU_SYSTEM_PLATFORM=qemu-system-i386
IMG_FILE=output/RoureXOS.img
$(QEMU_SYSTEM_PLATFORM) \
        -chardev serial,path=/dev/ttyUSB0,id=usbserial \
        -serial chardev:usbserial \
        -rtc base=localtime,clock=host \
        -blockdev driver=file,node-name=floppy0,filename=${IMG_FILE} \
        -device floppy,drive=floppy0

Note: This attaches the USB serial port adapter to the virtual machine, uses the host RTC (real-time clock), and emulates a floppy disk drive with IMG_FILE inserted as a virtual diskette.

To check the port status, the port command can be used. Even port signal changes can be detected; see Fig. 6.

Fig. 6: Serial port change detection. Modem and line status according to BIOS. Connection indication in the prompt from v0.7.9.
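Fig. 6 mentions modem and line status "according to BIOS". On a PC BIOS such status is typically returned by INT 14h, AH=03h, with the line status in AH and the modem status in AL. Whether the port command uses exactly this call is an assumption; the helper names below are hypothetical and the code is a host-side C99 sketch of how such status bytes could be decoded, not the RoureXOS implementation.

```c
#include <stdint.h>

/* Modem status bits (AL after INT 14h, AH=03h); they mirror the UART
 * modem status register. */
#define MS_DELTA_CTS 0x01u  /* CTS changed since last read */
#define MS_DELTA_DSR 0x02u  /* DSR changed since last read */
#define MS_RING_END  0x04u  /* trailing edge of ring indicator */
#define MS_DELTA_DCD 0x08u  /* carrier detect changed since last read */
#define MS_CTS       0x10u  /* clear to send */
#define MS_DSR       0x20u  /* data set ready */
#define MS_RING      0x40u  /* ring indicator */
#define MS_DCD       0x80u  /* data carrier detect (link is up) */

/* Line status bits (AH after the same call). */
#define LS_DATA_READY  0x01u
#define LS_OVERRUN_ERR 0x02u
#define LS_PARITY_ERR  0x04u
#define LS_FRAMING_ERR 0x08u
#define LS_BREAK       0x10u
#define LS_THR_EMPTY   0x20u
#define LS_TSR_EMPTY   0x40u
#define LS_TIMEOUT     0x80u

/* Hypothetical helper: non-zero while the modem reports a carrier. */
int modem_is_connected(uint8_t modem_status)
{
    return (modem_status & MS_DCD) != 0;
}

/* Hypothetical helper: non-zero if any signal changed (low "delta" nibble). */
int modem_state_changed(uint8_t modem_status)
{
    return (modem_status & 0x0Fu) != 0;
}

/* Hypothetical helper: non-zero on overrun, parity, or framing errors. */
int line_has_error(uint8_t line_status)
{
    return (line_status & (LS_OVERRUN_ERR | LS_PARITY_ERR | LS_FRAMING_ERR)) != 0;
}
```

Polling the status and checking the delta bits is one simple way to notice a change such as a carrier appearing or dropping.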

communication with a modem

For testing purposes (from v0.7.3), there is a command kom to open/attach a communication tunnel with the counterpart (mostly a dial-up modem). An active connection can be left intact while one continues at the base prompt; the session can then be attached again.

caller side (rourexos)

In RoureXOS, there is a simple terminal interface (program) implemented as the kom command. This program is external to the kernel itself; put more simply, it is an external program that is loaded and executed by the system kernel.

Below are some examples of how to get the configuration status of a modem (ati4) and how to dial a remote counterpart (atdtXXX).

kom

ati4
atdt8888

Fig. 7: Serial link communication with a modem. QEMU machine with the serial device attached (passthrough).

counterpart side (linux)

# 9600 baud, 8-bit character size (cs8),
# one stop bit per character (-cstopb),
# generate a parity bit on output and expect one on input (parenb)
screen /dev/ttyUSB1 9600,cs8,-cstopb,parenb

ata

Fig. 8: Terminal attached to the counterpart’s serial link to the modem. The virtualized RoureXOS instance is attached to a serial port too, with a modem connected as well. RoureXOS dials, the counterpart (green screen) answers, and negotiation takes place. After that the link acts as a Layer 1 link. The string on the last line was sent from the RoureXOS instance over the line.
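The screen options used on the Linux side correspond to standard termios flags. The following is a minimal sketch of the same line settings expressed in C, assuming a POSIX system; the function name is hypothetical, and the code only fills in a termios structure without opening any device.

```c
#include <string.h>
#include <termios.h>

/* Hypothetical helper: fill *t with 9600 baud, 8 data bits, parity
 * enabled, one stop bit (the cs8, -cstopb, parenb options).
 * Returns 0 on success, -1 on failure. */
int serial_settings_9600(struct termios *t)
{
    memset(t, 0, sizeof *t);

    t->c_cflag = CREAD | CLOCAL;       /* enable receiver, ignore modem control lines */
    t->c_cflag |= CS8;                 /* cs8: 8-bit characters */
    t->c_cflag &= ~(tcflag_t)CSTOPB;   /* -cstopb: one stop bit */
    t->c_cflag |= PARENB;              /* parenb: parity on output and input */

    if (cfsetispeed(t, B9600) != 0)
        return -1;
    if (cfsetospeed(t, B9600) != 0)
        return -1;
    return 0;
}
```

To apply the settings to a real port, the structure would be passed to tcsetattr() together with a file descriptor obtained from open() on the serial device.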

ASCII terminal session over the dial-up link

Note: opsidian is a server with a modem connected via a USB adapter; it acts as the dial-in server and as a telnet gateway via mgetty.

Fig. 9: Executing the initial make command for the swis-api project. No TERM variable is set, so the interactive features make the terminal unstable.

Fig. 10: Interacting with FozzTexx’s Level 29 BBS.

Fig. 11: Czech RSS news feed parsed and transformed to Unicode so that it is viewable in rourexos. In between sits a tiny HTTP/TELNET gateway (bbs-go TELNET server).

16bit memory model

3.5" floppy disk

A typical 3.5" floppy disk has these parameters:

2 heads
80 tracks (cylinders) per head
18 sectors per track
512 bytes per sector

`2 * 80 * 18 * 512 = 1 474 560 bytes ~ 1.44 MB`

Fig. 12: Cylinder-Head-Sector (CHS) architecture of a generic disk storage device. [16]

LBA to CHS

Modern addressing uses Logical Block Addressing (LBA), which works with so-called linear addresses. The older system is called the Cylinder-Head-Sector (CHS) scheme. The two schemes are interchangeable: an address in one can be converted into the other and vice versa (see the equations below).

LBA -- Logical Block Addressing
HPC -- Heads Per Cylinder/Track
SPT -- Sectors Per Track
C   -- Cylinder no.
H   -- Head no.
S   -- Sector no.

LBA = (C * HPC + H) * SPT + (S - 1)

C = LBA / (HPC * SPT)
H = (LBA / SPT) % HPC
S = (LBA % SPT) + 1
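
The equations above translate directly into integer arithmetic. Below is a minimal C sketch using the 3.5" floppy geometry listed earlier (2 heads, 18 sectors per track); the type and function names are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

#define HPC 2u   /* heads per cylinder (3.5" floppy) */
#define SPT 18u  /* sectors per track (3.5" floppy) */

typedef struct {
    uint16_t c;  /* cylinder number, 0-based */
    uint8_t  h;  /* head number, 0-based */
    uint8_t  s;  /* sector number, 1-based */
} Chs;

/* LBA = (C * HPC + H) * SPT + (S - 1) */
uint32_t chs_to_lba(Chs a)
{
    return ((uint32_t)a.c * HPC + a.h) * SPT + (a.s - 1u);
}

/* C = LBA / (HPC * SPT), H = (LBA / SPT) % HPC, S = (LBA % SPT) + 1 */
Chs lba_to_chs(uint32_t lba)
{
    Chs a;
    a.c = (uint16_t)(lba / (HPC * SPT));
    a.h = (uint8_t)((lba / SPT) % HPC);
    a.s = (uint8_t)((lba % SPT) + 1u);
    return a;
}
```

For example, LBA 100 maps to cylinder 2, head 1, sector 11, and the last sector of the diskette (LBA 2879) maps to cylinder 79, head 1, sector 18.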

fresh FAT12 filesystem

# make 2880 512-byte sectors/clusters
dd if=/dev/zero of=output.img bs=512 count=2880

# make FAT12 filesystem
mkfs.msdos output.img

# burn the bootloader in
dd if=tmp/boot.bin of=output.img conv=notrunc seek=0

# mount and copy kernel in
sudo mount -t msdos -o loop,fat=12,check=strict,uid=1000,gid=1000,debug output.img mnt/
cp output/files/ROUREXS.COM mnt/rourexs.com

rourex file system (rfs)

Originally, the OS was meant to work with FAT12-formatted diskettes/floppy binary images. Finding and reading files had already been implemented, but writing files proved to be a difficult task. While reading through posts on FAT12 architecture and implementation, I got the idea of implementing my own filesystem (thus partly inspired by FAT12, the File Allocation Table).

Fig. 13: Nested directories testing example screenshot (still WIP).

overview

The root directory starts at sector LBA (Logical Block Addressing) \(100\). Each new file (including new directories) is allocated the next sector index, so the files’ sector numbers go as \(\{101, 102, 103, \dots, 119\}\). Directories are allocated by increasing index too, but in hundreds, going as \(\{200, 300, 400, \dots, 1900\}\). Thus around 360 files can be allocated at the moment.

The system is designed mainly for floppy images, which usually work with 512-byte blocks (sectors, or clusters). Each file occupies exactly one sector: its metadata are stored in a 32-byte header, leaving 480 bytes for the file contents. Directories are stored as files too, but with zero data size and no content. This leaves room for improvement (e.g. implementing some kind of directory table as in FAT).
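
The numbering scheme described above can be written down as a few constants and helpers. This is only a sketch of the layout as described in this section; all names are hypothetical, and the limits (19 file slots per directory, 18 subdirectories) are read off the sector ranges given above.

```c
#include <stdint.h>

#define RFS_SECTOR_SIZE   512u
#define RFS_HEADER_SIZE   32u
#define RFS_DATA_SIZE     (RFS_SECTOR_SIZE - RFS_HEADER_SIZE)  /* 480 bytes */
#define RFS_ROOT_LBA      100u   /* root directory sector */
#define RFS_DIR_STRIDE    100u   /* directories sit at 200, 300, ... */
#define RFS_FILES_PER_DIR 19u    /* slots 101..119 in the root directory */
#define RFS_MAX_SUBDIRS   18u    /* 200..1900 */

/* LBA of the n-th subdirectory (n = 1..18); 0 if n is out of range. */
uint16_t rfs_dir_lba(unsigned n)
{
    if (n < 1u || n > RFS_MAX_SUBDIRS)
        return 0;
    return (uint16_t)(RFS_ROOT_LBA + n * RFS_DIR_STRIDE);
}

/* LBA of a file slot (1..19) inside the directory at dir_lba; 0 if invalid. */
uint16_t rfs_file_lba(uint16_t dir_lba, unsigned slot)
{
    if (slot < 1u || slot > RFS_FILES_PER_DIR)
        return 0;
    return (uint16_t)(dir_lba + slot);
}

/* Byte offset of a sector inside the floppy image file. */
uint32_t rfs_image_offset(uint16_t lba)
{
    return (uint32_t)lba * RFS_SECTOR_SIZE;
}
```

With these helpers the root directory header sits at byte offset 51200 of the image, and its first file slot is sector 101.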

Fig. 14: Filesystem entries diagram. The root directory is allocated at LBA 100 by default and holds 4 entries, one of them of a directory type. This subdirectory is recorded both in the root directory and at LBA 200, and itself holds two other (system) files.

file types

type	value	description
`T_FREE`	0x00	free sector, default
`T_TEXT`	0x01	text file
`T_DIR`	0x02	directory
`T_SYS`	0x03	system file (restricted access)
`T_PWD`	0x04	password file (restricted access, hashed?)
`T_BAK`	0x05	backup file
`T_LINK`	0x06	symbolic link to a file (using `sector_no2`)
`T_DELETED`	0x07	deleted file/entry

32-byte entry header

Entry metadata are derived from FAT12 Directory Entry metadata scheme. [1]

byte offset	length in bytes	metadata name	description
0	8	`name`	entry base name
8	3	`ext`	entry extension
11	1	`type`	entry/file type
12	1	`mode`	entry mode
13	1	`creation_tenth`	entry creation time in tenths of seconds
14	2	`creation_time`	entry creation time
16	2	`creation_date`	entry creation date
18	2	`last_access_date`	entry last access date
20	2	`sector_no`	main sector LBA address
22	2	`modify_time`	last modification time
24	2	`modify_date`	last modification date
26	2	`sector_no2`	secondary sector LBA address
28	4	`filesize`	entry/file size
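
The two tables above (file types and the 32-byte header) could be expressed in C roughly as follows. This is a sketch for a modern host compiler with fixed-width types, not the Turbo C definition used by the kernel; the type names are hypothetical, while the field names, type names of the constants, offsets, and lengths are taken from the tables.

```c
#include <stddef.h>
#include <stdint.h>

/* File types from the table above. */
typedef enum {
    T_FREE    = 0x00,
    T_TEXT    = 0x01,
    T_DIR     = 0x02,
    T_SYS     = 0x03,
    T_PWD     = 0x04,
    T_BAK     = 0x05,
    T_LINK    = 0x06,
    T_DELETED = 0x07
} RfsEntryType;

/* 32-byte entry header; every field lands on its natural alignment,
 * so no packing pragma is needed on common compilers. */
typedef struct {
    char     name[8];           /* offset  0: entry base name */
    char     ext[3];            /* offset  8: entry extension */
    uint8_t  type;              /* offset 11: RfsEntryType value */
    uint8_t  mode;              /* offset 12: entry mode */
    uint8_t  creation_tenth;    /* offset 13: tenths of seconds */
    uint16_t creation_time;     /* offset 14 */
    uint16_t creation_date;     /* offset 16 */
    uint16_t last_access_date;  /* offset 18 */
    uint16_t sector_no;         /* offset 20: main sector LBA */
    uint16_t modify_time;       /* offset 22 */
    uint16_t modify_date;       /* offset 24 */
    uint16_t sector_no2;        /* offset 26: secondary sector LBA */
    uint32_t filesize;          /* offset 28 */
} RfsEntryHeader;

/* Compile-time check: the array size becomes -1 (an error) if the
 * header is not exactly 32 bytes. */
typedef char rfs_header_must_be_32_bytes[(sizeof(RfsEntryHeader) == 32) ? 1 : -1];
```

The compile-time size check is a cheap guard against accidentally changing the on-disk layout when a field is added or resized.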

function prototypes

/*
 *  f_open()
 *  opens <name> file with <mode> mode, returns pointer to <entry> Entry
 *  http://elm-chan.org/fsw/ff/doc/open.html
 *
 *  modes:
 *  M_READ, M_WRITE, M_EXECUTE
 */
F_STATUS f_open(Entry *entry, char *name, byte mode);

/*
 *  f_close()
 *  closes <entry> Entry, writes F_CLOSED
 */
F_STATUS f_close(Entry *entry);

/*
 *  f_list()
 *  lists entries of the current directory
 */
F_STATUS f_list();

/*
 *  f_find()
 *  finds given <filename> in the current directory
 */
BOOL f_find(Entry *entry, char *filename);

/*
 *  f_newdir()
 *  creates a new directory with <name>
 */
F_STATUS f_newdir(char *name);

/*
 *  f_newfile()
 *  creates a new file of given <name>, <ext> (extension), and <type>
 *  entry is then loaded with <input_buff> content data
 */
F_STATUS f_newfile(char *name, char *ext, E_TYPE type, char *input_buff);

/*
 *  get_free_sector()
 *  browses the filesystem and finds a free sector to write
 */
int get_free_sector(E_TYPE type);

[…]
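
To illustrate what a free-sector search might look like under the sector numbering described in the overview, here is a hedged sketch that works on an in-memory copy of the floppy image. It is not the author's get_free_sector(): the function name, its extra parameters, and the scan order are assumptions; only the type byte offset (11), the type values, and the sector ranges come from this article.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SECTOR_SIZE 512u
#define SKETCH_TYPE_OFFSET 11u    /* `type` byte inside the 32-byte header */
#define SKETCH_T_FREE      0x00u
#define SKETCH_T_DIR       0x02u

/* Non-zero if the header of the given sector is marked as free. */
static int sketch_sector_is_free(const uint8_t *image, unsigned lba)
{
    return image[(size_t)lba * SKETCH_SECTOR_SIZE + SKETCH_TYPE_OFFSET] == SKETCH_T_FREE;
}

/* Hypothetical search: directories are looked up at 200, 300, ..., 1900,
 * other entries in the 19 slots following the current directory sector.
 * Returns the LBA of the first free sector, or -1 if none is left. */
int sketch_find_free_sector(const uint8_t *image, unsigned current_dir_lba, uint8_t type)
{
    unsigned lba;

    if (type == SKETCH_T_DIR) {
        for (lba = 200u; lba <= 1900u; lba += 100u)
            if (sketch_sector_is_free(image, lba))
                return (int)lba;
    } else {
        for (lba = current_dir_lba + 1u; lba <= current_dir_lba + 19u; lba++)
            if (sketch_sector_is_free(image, lba))
                return (int)lba;
    }
    return -1;
}
```

On a freshly zeroed image the first file in the root directory would land at sector 101 and the first subdirectory at sector 200.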

COM program execution

As 16-bit Real Mode is used, one has to do some homework on memory segmentation and gain a deeper understanding of segment registers (the segment part of a memory address) and general purpose registers (the offset part of a memory address). [2] [3] [5]

“640k ought to be enough for anybody” — Bill Gates, 1981 [5]

Tab. 1: Segment registers, general purpose (GP) registers, and their link to memory segmentation. [2] [5]

segment register	GP register (offset)	memory segment
`CS`	`IP`	code
`SS`	`SP`, `BP`	stack
`DS`	`BX`, `SI`, `DI`	data
`ES`	`DI`	code/data
`FS`		code/data (i386 and higher only)
`GS`		code/data (i386 and higher only)

Fig. 15: Segment and GP registers in Real Mode. [2]
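
In Real Mode a segment:offset pair is turned into a 20-bit linear address by shifting the segment left by four bits (multiplying by 16) and adding the offset. The following C sketch shows that arithmetic, including the wrap-around at 1 MiB that applies while the A20 line is disabled; the function names are hypothetical.

```c
#include <stdint.h>

/* Linear address of segment:offset with the A20 line enabled. */
uint32_t rm_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Same address with the A20 line disabled: bit 20 is forced to zero,
 * so addresses above 1 MiB wrap around to the start of memory. */
uint32_t rm_linear_a20_off(uint16_t segment, uint16_t offset)
{
    return rm_linear(segment, offset) & 0xFFFFFu;
}
```

Different pairs can name the same byte: 0000:7C00 and 07C0:0000 both resolve to 0x7C00. The highest pair, FFFF:FFFF, resolves to 0x10FFEF, which lies just under 64 kB above the 1 MiB boundary and is reachable only with A20 enabled.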

far memory jumps (Real Mode, NASM/TASM)

Jumping to a raw address requires setting the segment and GP registers to point to the new address. Far jumps (jumps to another segment, also called intersegment jumps) have been complicated to implement in TASM (Turbo Assembler), as there are some limitations present (e.g. a strict memory addressing procedure). [4]

[BITS 16]       ; 16-bit Real Mode
[EXTERN _main]  ; symbols from the C source carry a leading underscore
[GLOBAL test]

test:
        ; call the main() function from the C source
        call _main

        ; point the segment registers and the stack at segment 0 (bootsector)
        mov ax, 0x0
        mov es, ax
        mov ds, ax
        mov ss, ax
        xor sp, sp

        ; push segment, then offset; enable interrupts and
        ; far-return to 0000:7c00
        push es
        push word 0x7c00
        sti
        retf

Note: in the C code those instructions are prefixed with asm, as an example of inline assembly. The code is assembled using TASM and TCC (Turbo C Compiler). [10]

COM binary disassemble

`objdump -b binary -m i8086 -M intel -rwC -D output/files/KOM.COM`

QEMU debugging

Save physical memory (pmemsave) starting at the address 0x00000 of length of 128 kB to the file called mem.bin.

# physical memory
pmemsave 0x0 128000 mem.bin

# virtual memory
memsave 0x0 128000 mem.bin

To list the CPU registers just use this line on the compatmonitor0 QEMU tab: [15]

`info registers`

custom programming

To begin with custom programming (writing standalone binaries/programs in RoureXOS), it is vital to possess some sort of text editor to write, read, and edit the source code.

text editor

Since v0.11.1 there is an external program called simply ED.COM, which can be loaded by kernel as a custom text editor with its own interface and tooling. (WIP)

Fig. 16: A simple line editor prototype. (v0.11.4)

Q: Is it viable to write a simple line text editor in C and x86 assembly in 2024?

A: Yes, writing a simple line text editor in C and x86 assembly remains viable in 2024. While there are many modern tools available for text editing, developing a simple text editor in C and x86 assembly can be a useful learning experience for understanding low-level programming concepts and system interactions. Additionally, creating such a tool from scratch can provide insights into how text editors function at a foundational level.

[…]

assembler + linker

A linker is probably not needed if the goal is a single binary from a single ASM file. The assembler (parser) would need a lookup table mapping each instruction of the given instruction set to its opcode.
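
As a rough idea of what such a lookup table could look like, here is a minimal C sketch covering only a handful of single-byte x86 instructions without operands. The structure and function names are hypothetical; a real assembler would also need operand parsing and instruction encoding, which this sketch does not attempt.

```c
#include <string.h>

typedef struct {
    const char   *mnemonic;
    unsigned char opcode;
} OpcodeEntry;

/* A few single-byte, operand-less x86 instructions. */
static const OpcodeEntry OPCODES[] = {
    { "nop",  0x90 },
    { "ret",  0xC3 },
    { "retf", 0xCB },
    { "iret", 0xCF },
    { "hlt",  0xF4 },
    { "cli",  0xFA },
    { "sti",  0xFB }
};

/* Returns the opcode for a mnemonic, or -1 if it is not in the table. */
int lookup_opcode(const char *mnemonic)
{
    size_t i;
    for (i = 0; i < sizeof OPCODES / sizeof OPCODES[0]; i++) {
        if (strcmp(OPCODES[i].mnemonic, mnemonic) == 0)
            return OPCODES[i].opcode;
    }
    return -1;
}
```

A linear scan is enough for a table this small; a sorted table with binary search would be a natural next step once the instruction list grows.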

[…]

interrupt handlers and API

[…]

interrupt table

The following tables are a concept of how to organize the main API interrupt vector (0x30).

Each table links to a special system binary (e.g. RFS2.COM), which will be loaded into memory at system start. A pointer to the address containing the Interrupt Service Routine (ISR) will be hardcoded in the wrapper ISR, which will act as a switch for the other external routines (SWITCH.COM/API.COM).

Example code of the main (wrapper) interrupt handler: [12]

inthandler:
   cmp ah,0
   je .ahzero
 
   cmp ah,1
   je .ahone
 
   cmp ah,2
   je .ahtwo
 
   ; ......
 
   mov si,msgBadAH
   call print_string
   cli
   hlt
 
.ahzero:
   ; Do whatever needed here
   iret

file manipulation services

ISR will be loaded from RFS2.COM binary. [18]

interrupt vector	`AH`	description
0x30	0x10	read file in current directory, filename in `SI`, buffer in `BX`
	0x11	write to file in current directory, filename in `SI`, buffer in `BX`
	0x12	add a new file in current directory, filename in `SI`, buffer in `BX`
	0x13	delete a file in current directory, filename in `SI`
	0x14	load a file to program memory and execute, filename in `SI`
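
The AH values from the table above can be modelled as a small dispatch table, mirroring the cmp/je chain of the wrapper handler. This is a host-side C sketch of the concept only; the names are hypothetical and the real services would live in RFS2.COM as described above.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t     ah;    /* service number passed in AH */
    const char *name;  /* short label, for illustration only */
} ApiService;

/* File manipulation services of interrupt vector 0x30. */
static const ApiService RFS_SERVICES[] = {
    { 0x10, "read"    },
    { 0x11, "write"   },
    { 0x12, "create"  },
    { 0x13, "delete"  },
    { 0x14, "execute" }
};

/* Returns the service matching AH, or NULL for an unknown value
 * (the point where the wrapper ISR would report a bad AH). */
const ApiService *api_find_service(uint8_t ah)
{
    size_t i;
    for (i = 0; i < sizeof RFS_SERVICES / sizeof RFS_SERVICES[0]; i++) {
        if (RFS_SERVICES[i].ah == ah)
            return &RFS_SERVICES[i];
    }
    return NULL;
}
```

Keeping the services in a table rather than a chain of comparisons makes it easier to add further service groups later.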

[…]

adding the Handler to the IVT

This is fairly simple. [12]

First, null out ES.
Set AL=interrupt number, and BL=4h.
Multiply AL by BL, and then put the result (AX) in BX.
Move the word that is the address of the start of your interrupt handler into [es:bx].
Add 2 to BX.
Move your handler’s segment into [es:bx].
Restore your original ES and you’re done!

Example code for the steps above: [17]

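push es
xor ax, ax                  ; segment registers cannot be XORed directly,
mov es, ax                  ; so ES is zeroed through AX (also clears AH)
mov al, 0x30                ; interrupt no. 0x30
mov bl, 0x04
mul bl                      ; the result (0x00c0) is in AX
mov bx, ax
cli                         ; no interrupts while the vector is half-written
mov word [es:bx], 0x0200    ; handler's offset 200h
add bx, 0x02
mov word [es:bx], 0x0ff0    ; handler's segment ff0h
sti
pop es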
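
The same bookkeeping can be modelled on a plain byte array, which makes the layout of an IVT entry easy to inspect on any host. This is a sketch with hypothetical function names: each of the 256 vectors takes 4 bytes, the offset word first and the segment word second, both little-endian.

```c
#include <stdint.h>

#define IVT_SIZE 1024u  /* 256 vectors * 4 bytes each */

/* Store segment:offset of a handler into a simulated IVT. */
void ivt_set(uint8_t *ivt, uint8_t vector, uint16_t segment, uint16_t offset)
{
    unsigned base = (unsigned)vector * 4u;

    ivt[base + 0u] = (uint8_t)(offset & 0xFFu);   /* offset, low byte */
    ivt[base + 1u] = (uint8_t)(offset >> 8);      /* offset, high byte */
    ivt[base + 2u] = (uint8_t)(segment & 0xFFu);  /* segment, low byte */
    ivt[base + 3u] = (uint8_t)(segment >> 8);     /* segment, high byte */
}

/* Linear address of the handler registered for a vector. */
uint32_t ivt_handler_linear(const uint8_t *ivt, uint8_t vector)
{
    unsigned base = (unsigned)vector * 4u;
    uint16_t offset  = (uint16_t)(ivt[base + 0u] | (ivt[base + 1u] << 8));
    uint16_t segment = (uint16_t)(ivt[base + 2u] | (ivt[base + 3u] << 8));

    return ((uint32_t)segment << 4) + offset;
}
```

For vector 0x30 with segment 0x0ff0 and offset 0x0200 (the values from the assembly above), the entry occupies bytes 0xC0 to 0xC3 and the handler's linear address is 0x10100.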

ChatGPT answer

In x86 assembly and C language, you can load a new interrupt vector routine function to the interrupt vector table in the BIOS by following these steps:

Write your interrupt handler routine in assembly or C.
Obtain the interrupt vector number for the specific interrupt you want to handle.
Load the new interrupt vector by writing the segment and offset address of the routine to the corresponding entry in the interrupt vector table.
Ensure proper memory protection if needed by marking the memory as read-only after modifying the interrupt vector table.

Remember to be cautious when directly modifying BIOS interrupt vectors, as incorrect changes can lead to system instability or crashes.

#include <stdint.h>

void set_interrupt_vector(uint8_t interrupt_number, uint16_t segment, uint16_t offset) {
    uint32_t vector_address = interrupt_number * 4; // Each entry is 4 bytes

    // Calculate address to store segment and offset
    uint16_t* vector_ptr = (uint16_t*)vector_address;

    // Inline assembly to store the segment and offset in memory
    __asm__ volatile (
        "movw %1, %%es:(%0) \n" // Store offset
        "movw %2, %%es:(%0+2)"  // Store segment
        :
        : "r" (vector_ptr), "r" (offset), "r" (segment)
        : "memory"
    );
}

int main() {
    uint8_t interrupt_number = 0x09;  // For example, keyboard interrupt
    uint16_t segment = 0xFFFF;        // Dummy segment
    uint16_t offset = 0x0000;         // Dummy offset

    set_interrupt_vector(interrupt_number, segment, offset);

    return 0;
}

games

simple snake

There is a concept of writing simple games for RoureXOS, or at least porting them to the system somehow. The game (SNAKE.COM) is to be stored in binary format on the floppy disk together with the other system tools. [14]

[…]

references

reference	link
[1]	https://wiki.osdev.org/FAT#Standard_8.3_format
[2]	http://www.c-jump.com/CIS77/ASM/Memory/lecture.html
[3]	https://web.archive.org/web/20130609073242/http://www.osdever.net/tutorials/rm_addressing.php?the_id=50
[4]	http://computer-programming-forum.com/46-asm/c14eb326bc22253a.htm
[5]	http://www.cs.cmu.edu/~410-s07/p4/p4-boot.pdf
[6]	http://skocovsky.eu/paposu2008/paposu2008.pdf
[7]	https://csc-knu.github.io/sys-prog/books/Andrew%20S.%20Tanenbaum%20-%20Operating%20Systems.%20Design%20and%20Implementation.pdf
[8]	http://web.archive.org/web/20100227072513/http://www.shutdown.cz/clanky/altairos-vyvoj-operacniho-systemu/
[9]	https://www.soom.cz/clanky/490--Tvorime-OS-1dil
[10]	HEROUT, P. Učebnice jazyka C, 2.díl. 2000. České Budějovice: Kopp.
[11]	https://ostad.nit.ac.ir/payaidea/ospic/file1615.pdf
[12]	https://wiki.osdev.org/Real_mode_assembly_IV
[13]	https://www.cs.virginia.edu/~evans/cs216/guides/x86.html
[14]	https://nullprogram.com/blog/2014/12/09/
[15]	https://rip.hibariya.org/post/show-register-values-of-qemu-emulator/
[16]	https://en.wikipedia.org/wiki/Cylinder-head-sector
[17]	https://stackoverflow.com/a/40934959
[18]	https://en.wikipedia.org/wiki/DOS_API#Interrupt_vectors_used_by_DOS
[19]	https://poli.cs.vsb.cz/edu/apps/soj/down/apps-soj-skripta.pdf
