Windows' Hello World in x86_64
Windows needs some assembly love

I'm an old programmer who loves to tinker with old things.
So I've written about how to do Hello World in Linux and macOS. You might want to check those out.
Toolchain
For this article, I will still use the GNU Assembler as our tool of choice. We'll be using MinGW-w64 to build our hello world project. MinGW-w64 is a GCC distribution for building Windows applications. You can install it on Windows, macOS, or Linux.
On Linux or macOS you can install it with your package manager. For Ubuntu and Debian:
# Ubuntu & Debian
sudo apt install mingw-w64
For other Linux distributions, check the manual of the respective package manager.
Writing the application
With Windows, it'll be a little different. I'll be creating two programs. The first is the same kind of command line tool we wrote for Linux and macOS. The second uses the pure Windows API to show "Hello World" in a message box.
As with macOS, our code will be Position Independent Code, so we'll use relative addressing with RIP as the base address.
Command Line Style Hello World
With Linux and macOS we can use the syscall instruction directly. On Windows we cannot, because the system call numbers change between versions. You can see the system call numbers for Windows here: https://j00ru.vexillium.org/syscalls/nt/64/. To print "Hello, World!" we need to call something that writes to stdout and something that exits the program. In Linux those were system call number 1 (write) and 60 (exit). Since Windows system calls aren't stable across versions, we'll use the Windows API instead:
- We need to get a handle to stdout by calling GetStdHandle with -11 as the value of the parameter.
- To be able to write to stdout we need to call the WriteFile function from the Windows API.
- And after everything is done we'll call ExitProcess.
These are normal function calls, so we need to follow the Windows x64 ABI and calling convention. Some important points are:
Integer valued arguments in the leftmost four positions are passed in left-to-right order in RCX, RDX, R8, and R9, respectively. The fifth and higher arguments are passed on the stack as previously described.
These registers, and RAX, R10, R11, XMM4, and XMM5, are considered volatile
The shadow space is the mandatory 32 bytes (4x8 bytes) you must reserve for the called procedure. The call that enters our entry point has already pushed an 8-byte return address onto the stack, which leaves RSP 8 bytes off a 16-byte boundary, so we'll allocate 40 bytes (32 bytes of shadow space + 8 bytes to realign the stack).
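The 40-byte figure can be checked with a little arithmetic. The C sketch below is only an illustration: the function name frame_size is hypothetical, and it assumes a function with no local variables and no pushed registers, where RSP is 8 bytes past a 16-byte boundary on entry.

```c
#include <stdint.h>

/* Shadow space every caller must reserve: 4 register homes x 8 bytes. */
#define SHADOW_SPACE 32u

/*
 * Hypothetical helper: bytes to subtract from RSP in the prologue so that
 * RSP is 16-byte aligned at every call site.
 * stack_args = largest number of stack-passed (fifth and later) arguments
 * of any call the function makes.
 */
uint64_t frame_size(uint64_t stack_args) {
    uint64_t need = SHADOW_SPACE + 8u * stack_args;
    /* The CALL that got us here pushed an 8-byte return address, so
       need + 8 has to be rounded up to a multiple of 16. */
    return ((need + 8u + 15u) & ~(uint64_t)15u) - 8u;
}
```

Under these assumptions frame_size(0) is 40, and frame_size(1) is also 40: the 8 bytes of alignment padding double as the slot for a fifth argument.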
I think this is enough to deduce how to write Hello World in Windows. We can start with the usual preamble, with .rodata defined.
.code64
.section .rodata
msg: .ascii "Hello, World!\n"
.set msglen, (. - msg)
.section .text
.global _start
_start:
sub $40, %rsp
add $40, %rsp
ret $0
We'd need to call Windows APIs: GetStdHandle, WriteFile, and ExitProcess. So we'll declare them as .extern.
.extern GetStdHandle
.extern WriteFile
.extern ExitProcess
Getting Standard Out Handle
The first task is to get a handle for stdout. For this we'll declare a symbol STD_OUTPUT_HANDLE, as documented in the Windows API, and assign it the value -11.
.set STD_OUTPUT_HANDLE, -11
Don't forget to add a new .data section to save the result of the call:
.section .data
stdout: .quad 0    # a HANDLE is 8 bytes on x64, so reserve a quad word
Below _start, we pass it as the first argument of GetStdHandle and then store the return value, which comes back in register RAX, into stdout.
_start:
sub $40, %rsp
mov $STD_OUTPUT_HANDLE, %rcx
call GetStdHandle
mov %rax, stdout(%rip)
This is similar to calling in C code:
HANDLE stdout = GetStdHandle(STD_OUTPUT_HANDLE);
Writing to Standard Out and Exiting
We'll use WriteFile API which has this prototype
BOOL WriteFile(
[in] HANDLE hFile,
[in] LPCVOID lpBuffer,
[in] DWORD nNumberOfBytesToWrite,
[out, optional] LPDWORD lpNumberOfBytesWritten,
[in, out, optional] LPOVERLAPPED lpOverlapped
);
Calling it from assembly means that we'll pass the first four parameters in RCX, RDX, R8, and R9, and the fifth parameter on the stack:
- hFile is the result of GetStdHandle; we'll pass it in RCX.
- lpBuffer is our message, so we'll load its address into RDX.
- nNumberOfBytesToWrite is our message length. We'll load it into R8.
- lpNumberOfBytesWritten is an address where we'll capture the number of bytes written. We'll define a location in the .data section and pass its address in R9.
- lpOverlapped is a pointer to an OVERLAPPED structure for overlapped I/O. We don't use it, so we'll pass 0 (NULL) in the stack slot just above the shadow space, at 32(%rsp).
So, we'll define bytes_written as a location in .data section.
bytes_written: .long 0
Then we'll call the function
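mov stdout(%rip), %rcx          # hFile
lea msg(%rip), %rdx             # lpBuffer
mov $msglen, %r8                # nNumberOfBytesToWrite
lea bytes_written(%rip), %r9    # lpNumberOfBytesWritten
movq $0, 32(%rsp)               # lpOverlapped = NULL, stored above the shadow space (a push would misalign RSP)
call WriteFile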
Last but not least, we exit the process with 0 as the exit code:
xor %rcx, %rcx
call ExitProcess
ret
The full source code will be:
.code64
.section .rodata
msg: .ascii "Hello, World!\n"
.set msglen, (. - msg)
.extern GetStdHandle
.extern WriteFile
.extern ExitProcess
.set STD_OUTPUT_HANDLE, -11
.section .data
stdout: .quad 0    # a HANDLE is 8 bytes on x64
bytes_written: .long 0
.section .text
.global _start
_start:
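sub $40, %rsp                   # 32 bytes shadow space + 8 bytes to realign RSP
mov $STD_OUTPUT_HANDLE, %rcx    # nStdHandle
call GetStdHandle
mov %rax, stdout(%rip)          # save the returned handle
mov stdout(%rip), %rcx          # hFile
lea msg(%rip), %rdx             # lpBuffer
mov $msglen, %r8                # nNumberOfBytesToWrite
lea bytes_written(%rip), %r9    # lpNumberOfBytesWritten
movq $0, 32(%rsp)               # lpOverlapped = NULL, stored above the shadow space
call WriteFile
xor %rcx, %rcx                  # uExitCode = 0
call ExitProcess
add $40, %rsp                   # not reached: ExitProcess does not return
ret $0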
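For comparison, here is roughly what the same three calls look like in C. This is a sketch, not the code the assembly was derived from: the names hello_msg and write_hello are made up for illustration, and the POSIX branch is an addition of mine so the file also builds on a non-Windows machine.

```c
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <unistd.h>   /* assumption: POSIX fallback so the sketch builds off Windows */
#endif

const char hello_msg[] = "Hello, World!\n";

/* Writes hello_msg to standard output.
   Returns the number of bytes written, or -1 on failure. */
long write_hello(void) {
    const size_t len = sizeof(hello_msg) - 1;  /* drop the trailing NUL, like msglen */
#ifdef _WIN32
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD written = 0;
    if (out == NULL || out == INVALID_HANDLE_VALUE)
        return -1;
    /* Fifth argument NULL: no overlapped I/O. */
    if (!WriteFile(out, hello_msg, (DWORD)len, &written, NULL))
        return -1;
    return (long)written;
#else
    return (long)write(STDOUT_FILENO, hello_msg, len);
#endif
}
```

A main function that calls write_hello() and returns 0 would behave like the assembly version, with the C runtime calling ExitProcess for us on Windows.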
We then assemble and link the file. Let's say we use the name main.S.
x86_64-w64-mingw32-as main.S -o main.o
x86_64-w64-mingw32-ld main.o -entry=_start -subsystem=console -lkernel32 -o hello.exe
The -entry parameter tells the linker the entry point of the program. -subsystem=console means that we want to build this for the command line interface. -lkernel32 means that we link against KERNEL32.DLL, the Windows library which provides GetStdHandle, WriteFile, and ExitProcess.
This is the result running in Windows 10

Message Box Style Hello World
This style of hello world shows a message box instead of printing hello world to the console. For that to happen, we'll invoke the MessageBoxA API instead of WriteFile. MessageBoxA is a Windows API that shows a message box with text in an 8-bit (ANSI) character set, which covers our ASCII text. The prototype is as follows:
int MessageBoxA(
[in, optional] HWND hWnd,
[in, optional] LPCSTR lpText,
[in, optional] LPCSTR lpCaption,
[in] UINT uType
);
As you can see, it accepts LPCSTR, which literally means Long Pointer to Constant String. What does that mean? It means it needs a zero-terminated string. We need to change our msg declaration to this:
msg: .asciz "Hello, World!"
We change the .ascii directive to .asciz, which terminates the string with a NUL character. I will also define another symbol for the caption:
caption: .asciz "Hello!"
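The extra byte that .asciz adds is the same one a C string literal carries. A small sketch, with the names box_msg, box_msg_chars, and box_msg_bytes invented for illustration:

```c
#include <string.h>

/* A C string literal is stored like .asciz: the characters plus one NUL byte. */
const char box_msg[] = "Hello, World!";

/* Number of visible characters, not counting the terminator. */
size_t box_msg_chars(void) { return strlen(box_msg); }

/* Number of bytes actually stored, terminator included. */
size_t box_msg_bytes(void) { return sizeof(box_msg); }
```

MessageBoxA takes no length parameter, so the terminating NUL is the only way it can find the end of the text.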
The last parameter is uType. We'll use the constants MB_OK and MB_ICONINFORMATION, defining them as .set symbols:
.set MB_OK, 0
.set MB_ICONINFORMATION, 0x40
Calling the function is as easy as loading those values into registers:
- hWnd will be 0 or NULL, as this is the parent window. We pass it in RCX.
- lpText will be the text. We load it into RDX.
- lpCaption will be the caption of the dialog. We load it into R8.
- uType will be the OR of MB_OK and MB_ICONINFORMATION and will be loaded into R9.
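To see what the OR produces, here is a small C sketch. The constant values are the ones from winuser.h, redefined locally so the snippet builds anywhere. The helper names mb_style, mb_buttons, and mb_icon are hypothetical.

```c
/* Values as defined in winuser.h, repeated here for illustration only. */
enum {
    MB_OK              = 0x00,
    MB_ICONINFORMATION = 0x40,
    MB_TYPEMASK        = 0x0F,  /* low nibble selects the buttons */
    MB_ICONMASK        = 0xF0   /* next nibble selects the icon */
};

/* Combine a button style and an icon style into one uType value. */
unsigned mb_style(unsigned buttons, unsigned icon) { return buttons | icon; }

/* Split a uType value back into its two parts. */
unsigned mb_buttons(unsigned utype) { return utype & MB_TYPEMASK; }
unsigned mb_icon(unsigned utype)    { return utype & MB_ICONMASK; }
```

Because MB_OK is 0, the combined value is simply 0x40. Writing the OR out still documents the intent that we want an OK button together with the information icon.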
Therefore the call will be:
xor %rcx, %rcx
lea msg(%rip), %rdx
lea caption(%rip), %r8
mov $(MB_OK | MB_ICONINFORMATION), %r9
call MessageBoxA
The result of the call will be saved to RAX, but we won't use it. The exit instructions are the same so the full source will be:
.code64
.section .rodata
msg: .asciz "Hello, World!"
caption: .asciz "Hello!"
.set MB_OK, 0
.set MB_ICONINFORMATION, 0x40
.extern MessageBoxA
.extern ExitProcess
.section .text
.global _start
_start:
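sub $40, %rsp                            # 32 bytes shadow space + 8 bytes to realign RSP
xor %rcx, %rcx                           # hWnd = NULL (no parent window)
lea msg(%rip), %rdx                      # lpText
lea caption(%rip), %r8                   # lpCaption
mov $(MB_OK | MB_ICONINFORMATION), %r9   # uType
call MessageBoxA
xor %rcx, %rcx                           # uExitCode = 0
call ExitProcess
add $40, %rsp                            # not reached: ExitProcess does not return
ret $0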
The assembling and linking process is pretty much the same:
x86_64-w64-mingw32-as main.S -o main.o
x86_64-w64-mingw32-ld main.o -o hello.exe \
-entry=_start -subsystem=windows \
-luser32 -lkernel32
The only difference is in the linking step: we use -subsystem=windows to say that we don't want a console, and we also need to link against USER32.DLL by adding -luser32. And here's the result:

Summary
So here we are: we have built a modern 64-bit hello world application on Linux, macOS, and Windows using GNU Assembler syntax. You may also be interested in my 16-bit Windows tutorial on YouTube.






